sunelehmann.com — intermittent updates

New Site

Thu, 30 Apr 2026 15:39:00 GMT

I hate to admit it, but I've been missing Twitter a little bit. I liked having a place to post little observations, thoughts, or dumb jokes. Somehow Bluesky isn't working for me. I like reading Bluesky - and I get useful info there, but I feel like I don't really know anyone there, so there's no one (or almost no one) to post to. And while LinkedIn now actually works for some things (announcing talks, new papers, etc), it's not right for the type of things I miss about Twitter (silly things). LinkedIn feels kind of like everyone is wearing a blue shirt. So, as a consequence, I've restarted my blog.

![credit ChatGPT 5.5](/assets/20260430_new_site/beginning.png)

I don't really expect anyone to read my posts 😅 but since the blog is mine, at least it's a place where I can post the things I would have posted on Twitter.

### AI site

In rebooting my site I also did something radical. I stopped using WordPress and had Claude Code design the whole thing for me (hosting it on GitHub). I also had Claude write almost all of the static pages without much supervision. I think it's pretty good ... even though it's a bit weird having Claude write stuff in my voice. **Importantly**, however, the blog is all me. Every single character. More to come, soon.

Martin Rosvall Talk

Fri, 27 Feb 2026 20:41:06 GMT

In the Social Complexity Lab at DTU/SODAS, we're lucky to have legendary network scientist Professor Martin Rosvall visiting soon. He will give a talk on March 10th at 14:00 at DTU (full talk details below). Martin is a brilliant network scientist and professor of physics with a focus on computational science at Umeå University, where he heads the Integrated Science Lab (IceLab) and its excellence center on modeling adaptive mechanisms in living systems under stress. His research develops mathematical models, algorithms, and visualizations that reveal structure and dynamics in complex systems, from information flow and citation networks to ecology and spreading processes.

![](/assets/20260227_martin_rosvall_talk/screenshot-2026-02-27-at-8.44.57-pm.png)

And Martin is awesome in many dimensions: he's a deep thinker, he's someone who carefully crafts meticulous and mindblowing science communication. But he is probably best known (with Carl T. Bergstrom) for introducing the map equation framework for flow-based community detection in networks, and for the widely used Infomap approach built on those ideas. His work has been highly influential across network and data science, with publications and tools used broadly by researchers in many disciplines. Do not miss his talk!

The talk details are:

- **Time**: March 10th, 14:00
- **Place**: DTU Compute Building 324, room 240
- **Title**: Predicting interactions in dynamic networks
- **Abstract**: Predicting future interactions or novel links in networks is an indispensable tool with many applications across diverse domains, including drug repurposing based on genetic networks, money laundering detection in financial systems, and recommendation systems using transactional data. Among the many techniques developed for link prediction, those leveraging the networks' community structure have proven highly effective. For example, the recently proposed MapSim predicts links based on a similarity measure derived from the code structure of the map equation, an information-theoretic community-detection objective function that operates on network flows. The map equation benefits from Infomap, its fast optimization method widely regarded as one of the best network-clustering algorithms. However, we developed Infomap for static networks. While its stochastic greedy search algorithm excels at identifying reliable communities in network snapshots, Infomap cannot effectively integrate new data or adapt to smooth transitions over time in continuously evolving relational networks. This shortcoming raises a computational challenge: How can we equip the map equation framework with a computationally efficient optimization method to enable adaptive analysis of dynamic networks, leveraging evolving relational data to predict future interactions and uncover novel links in real time?

This is a great opportunity, don't miss it!

Hi-resolution voting maps of Denmark

Thu, 18 Dec 2025 07:57:22 GMT

As part of a project on something completely different, the always brilliant [Louis Boucherie](https://www.linkedin.com/in/louis-boucherie/) generated these maps of voting in Denmark during the 2022 general election [1]. Maybe it's me, but I can't remember seeing voting at this resolution (parishes [2]), and I found the resulting images surprisingly neat & interesting, so I thought it was worth a post.

![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.44.28-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.52.20-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.52.35-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.52.45-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.53.04-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.53.12-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.53.21-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.53.37-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.53.59-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.54.07-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.54.19-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.54.32-am.png)
![](/assets/20251218_hi_resolution_voting_maps_of_denmark/screenshot-2025-12-18-at-7.54.41-am.png)

**References**

[1] Data from

[2] Note that parishes don't align perfectly with constituencies, so some values are interpolations.

And if you're really interested, there's a nice PDF with "infinite" zoom that I can send upon request.

Op-ed in Politiken

Tue, 18 Apr 2023 14:35:04 GMT

With the awesome [Anders Søgaard](https://anderssoegaard.github.io), I recently wrote an Op-ed in the Danish newspaper Politiken. The op-ed is paywalled, so I'm posting a little pre-print of it [here](https://sunelehmann.com/wp-content/uploads/2023/04/kronik_sl.pdf) for your reading pleasure.

Philosophy of the Predicted Human: Cesar Hidalgo

Tue, 24 Jan 2023 15:33:32 GMT

[Jonas L Juul](https://jonassjuul.github.io) and I are organizing a series of talks this year on the topic of *The Predicted Human.* Our first speaker is the amazing Cesar Hidalgo. Did I mention that Cesar is amazing? Let me give you some reasons. He's the Director of the *Center for Collective Learning* which spans the Universities of Toulouse, Manchester, and Harvard. He's a highly cited scientist, famous for (among other things) the notion of economic complexity. He has written [several bestselling books](https://cesarhidalgo.com/books). There's a [TV show about him](https://www.inmyshoes.info). He started a [successful company](https://www.datawheel.us). The list goes on.

But all this worldly success is not why you should care about coming to hear Cesar speak. You should come because Cesar is a fountain of creativity. He has more unusual and inventive ideas before breakfast than most people have in months. And he's brimming with enough energy and excitement to make the energizer bunny envious. In summary, there's no excuse not to stop by.

**Title**: How time, technology, and language impact collective memory and attention

**Time**: February 9, 10am

**Place**: [Det Samfundsfaglige Fakultetsbibliotek](https://udenrigspolitik.dk/sted/det-samfundsvidenskabelige-fakultetsbibliotek-auditorium-1/). Gothersgade 140. ***Auditorium 1.***

**Abstract**: From writing to the web, humans have used communication technologies to enhance our collective memory. Yet, much of what was once popular is now forgotten. In this talk, I will present research exploring the roles played by time, language, and technologies on the dynamics of collective memory and attention. Using data on the attention received by biographies, scientific papers, songs, and movies, we will explore the universal decay of collective memory, the role played by languages in global fame, and the biases in attention and collective memory introduced by changes in technology.

**Official bio**: César A. Hidalgo is a Chilean-Spanish-American scholar known for his many contributions to economic complexity, data visualization, and applied artificial intelligence. Hidalgo leads the Center for Collective Learning at the Artificial and Natural Intelligence Institute (ANITI) of the University of Toulouse. He is also an Honorary Professor at the University of Manchester and a Visiting Professor at Harvard's School of Engineering and Applied Sciences. Between 2010 and 2019 Hidalgo led MIT’s Collective Learning group. Prior to working at MIT, Hidalgo was a research fellow at Harvard’s Kennedy School of Government. Hidalgo is also a founder of Datawheel, an award-winning company specializing in the creation of data distribution and visualization systems. He holds a Ph.D. in Physics from the University of Notre Dame and a Bachelor's in Physics from Universidad Católica de Chile. His contributions have been recognized with numerous awards, including the 2018 Lagrange Prize and three Webby Awards. He is also the author of three books: Why Information Grows (Basic Books, 2015), The Atlas of Economic Complexity (MIT Press, 2014), and How Humans Judge Machines (MIT Press, 2021).

Cesar is also an old friend from my time as a postdoc in Laszlo Barabasi's lab. To prove this connection, here is a BarabasiLab halloween-party photo from 2007 (from the author's personal collection) which shows the young Cesar (with Nick Blumm to his left and Pu Wang to his right).

![](/assets/20230124_philosophy_of_the_predicted_human_cesar_hidalgo/cesar.jpg)

If the talk topic intrigues you and you're interested in hearing more, you can check out these papers:

- Candia, Cristian, et al. "The universal decay of collective memory and attention." Nature human behaviour (2019)
- Ronen, Shahar, et al. "Links that speak: The global language network and its association with global fame." Proceedings of the National Academy of Sciences (2014)
- Jara-Figueroa, C., Amy Z. Yu, and César A. Hidalgo. "How the medium shapes the message: Printing and the rise of the arts and sciences." PloS one (2019)
- Yu, Amy Zhao, et al. "Pantheon 1.0, a manually verified dataset of globally famous biographies." Scientific data 3.1 (2016)

Tweets I would have written. December 19, 2022 edition

Mon, 19 Dec 2022 16:34:17 GMT

It's tough not to check Twitter, so I'm posting my random tweet ideas here:

- So many clustering algorithms create division wherever they're applied.
- Since stopping Twitter, I’m using RSS feeds instead. Kottke and Daring Fireball are all I’ve added so far and it works AMAZING.
- Blown away by Messi yesterday, I was reminded of this glorious #fivethirtyeight data science piece from 2014 “Lionel Messi is Impossible”

Post versus Person

Fri, 14 Oct 2022 10:59:40 GMT

**The setup**. In this post I want to discuss the topic of filter-bubbles / echo-chambers. I think it's dangerous to define too specifically what a filter bubble is (for reasons which will become apparent below), but the intuition is some kind of algorithmic sorting that exposes specific groups of people to content that resonates with them (so democrats see democrat-stuff, republicans see republican-stuff, etc).

**The two camps:** When I look at the literature, there seem to be two camps.

- On one hand, there is amazing research [1,2,3] (explained with exemplary clarity in my recent interview with [Piotr Sapiezynski](https://toolazy.buzzsprout.com/1753094/11090773)) which **a)** clearly reveals the mechanisms behind algorithmically sorted content on Facebook and **b)** explains how this sorted content leads to filter bubbles along partisan/racial/etc lines.
- On the other hand, there is another stream of research stating that the issue with social media is a kind of *breaking* of filter-bubbles (bubbles that exist in the first place). This one is illustrated below.

https://twitter.com/M\_B\_Petersen/status/1580210192146321408

(Actually it was the tweet above from my friend and collaborator Michael that made me think about this whole thing.)

**The paradox**. So what's going on!?! It's a paradox: There's incontrovertible evidence of filter bubbles. Yet the problem is that social media exposes us to *opposing* viewpoints (from outside the bubbles). That made me think: Is there a way that these viewpoints can both be true simultaneously? And upon reflection, I think there is.

One of the things that came up in my [podcast talk with Piotr](https://toolazy.buzzsprout.com/1753094/11090773) is that it's difficult for democrats to target content to republicans for a surprising reason. It is difficult because it's so painful for republicans to see Biden portrayed in a positive light (e.g. in political ads) that *republican users actually stop browsing Facebook sooner when they see such ads* (too much cognitive dissonance, perhaps). This leads to lost revenue from other advertisers. That's the essence of how the filter bubble works, in fact. (And it's the same for democrats and Trump, obviously.)

But the research in Michael's tweet above suggests that there's a different kind of "democrat content" that does not have that effect on republicans (and *vice versa*). In fact, my hypothesis is that there is content from the "other side" which does not compel people to leave the platform. Stuff that does not result in cognitive dissonance. One candidate for such content could be statements that are particularly outrage-generating (and conforming with our negative image of the other side, hence no cognitive dissonance).

Stated differently, perhaps there's not just partisan and non-partisan content. It's highly plausible to me that there are *some types* of partisan content that drive people away from their social platform of choice (displaying reasonable aspects of political opponents, discussing them in a positive light) and *other types* of partisan content (the more outrageous stuff) that keep people engaged.

**Resolving the paradox**. This explains the apparent paradox. There is filtering, but the filter bubbles do not filter *all* partisan content. The filtering just separates out the reasonable stuff and only sends the crazy stuff across political divides.

Hence the title of this piece, **post versus person**. This sorting of content is only possible because social media is about *posts* rather than *people*. Posts are bite-sized opinions, factoids, etc. People are complex, nuanced, good & bad. In the off-line world, I am presented with the full person, not just highly selected parts of them. And (in my experience) even people I strongly disagree with politically are usually pretty reasonable in most of their opinions and behavior.
Maybe the person I've known as a fun and friendly coffee-machine acquaintance for years will say something pretty outrageous one day, and I'll think "wow, that was weird and unexpected" ... and move on. But in the online world, I don't get the whole person. I don't get the reasonable and boring stuff. I don't get the history and context of the person. I just get the single crazy utterance, just the disembodied single post, because that's what'll keep me staring at the screen longer [4].

**What do you say?** My hypothesis around which types of content make it across the partisan divide is totally testable. I hope someone will go and investigate! (Also, I haven't really done a literature study, so perhaps this idea isn't new at all. If it isn't, let me know!)

We've long known that social media rewards only extreme snippets with attention, but I think this filter bubble aspect is new: There is *some* partisan content that reinforces existing views in a righteous not-cognitive-dissonance-inducing way (that's the stuff that makes it across the divide), then there's the more nuanced stuff that causes us to have to see the other side in a different light - a list of the good deeds done by my enemy. That stuff we never see. Further, by only exposing us to the most extreme views of our neighbors, social media is making us forget all the mundane stuff we have in common with all people.

By the way, realizing this issue even shows the outline of a solution: As we connect people online, one path towards a better system is one that establishes mechanisms that help us remember to see other people as whole beings - not just hot-takes.

Let me know what you think.

## References

1. https://dl.acm.org/doi/abs/10.1145/3359301
2. https://www.ieee-security.org/TC/SPW2022/ConPro/papers/ali-conpro22.pdf
3. https://dl.acm.org/doi/abs/10.1145/3514094.3534135
4. This view also explains why the old non-algorithmic Twitter somehow felt less insane. You were exposed to all tweets from entire people. There was still a selection, since people want likes, resulting in more extreme content, but at least you got the full person. With algorithmic sorting of maximally engaging content, we get rid of all the reasonable stuff that doesn't reinforce our existing beliefs.

Christoph Stadtfeld Talk

Wed, 13 Apr 2022 10:52:09 GMT

We are very excited to be joined by Christoph Stadtfeld on April 28th, where he'll discuss *The emergence of social networks and how they matter for individuals*. Christoph is an *Associate Professor of* *Social Networks* at ETH Zürich. His work investigates how social networks evolve over time and how individuals are affected by the emerging social structures that they are embedded in. The Swiss StudentLife Study that he conducted with his team from 2016 to 2020, for example, studies how friendship networks and social groups evolve among previously unacquainted university students, and how the emerging levels of social integration affect their well-being and academic success. He also develops statistical network models and computational methods to study social networks dynamics. ![](/assets/20220413_christoph_stadtfeld_talk/media_2FDvQrg14WsAEv5LJ.jpg%3Fname%3Dorig) And Christoph delivers truly award winning science. He received the Raymond Boudon Award of the European Academy of Sociology in 2018 and the Freeman Award of the International Network of Social Network Analysts in 2021. (full bio below) **Title**: The emergence of social networks and how they matter for individuals - **Date**: Thursday, April 28th, 2022. - **Time**: 13:00 - **Location**: Technical University of Denmark, Building 321, third floor lab-space (at the end of the hall). **Abstract**: Individuals have the deep psychological need to form ties with others. The relationships they build – who they become friends with, who they spend time with, who they talk to, who they approach for support – follow certain micro-level patterns. These are partly rooted in individuals’ preferences and partly in the opportunities that come with their existing social networks and environments. Individuals are in turn affected by their position in social networks, for example, through processes of social influence, or the access to social support and resources. 
Two important questions in social network research are thus, first, to explain how macro-level structures of social networks emerge from micro-level processes and, second, how these emerging, complex social networks affect the individuals they connect. In this talk I will discuss empirical research strategies that can help getting insights into these questions. From a methodological point of view I will discuss the utility of different data collection strategies, statistical data analysis, and agent-based simulations. The talk will further present empirical insights on the emergence of friendship ties and groups in social networks, and on the well-being and academic success of individuals. **Bio**: Christoph Stadtfeld is an Associate Professor of Social Networks at ETH Zürich in Switzerland. His empirical work investigates how social networks evolve over time and how individuals are affected by the emerging social structures that they are embedded in. The Swiss StudentLife Study that he conducted with his team from 2016 to 2020, for example, studies how friendship networks and social groups evolve among previously unacquainted university students, and how the emerging levels of social integration affect their well-being and academic success. He also develops statistical network models and computational methods to study social networks dynamics. Probably best-known in recent years is his work on actor-oriented models for relational event data (DyNAM) and corresponding open source software (the goldfish package in R). Christoph’s work appeared in sociological, methodological and interdisciplinary journals. He received the Raymond Boudon Award of the European Academy of Sociology in 2018 and the Freeman Award of the International Network of Social Network Analysts in 2021.

Much ado ...

Tue, 22 Mar 2022 07:43:59 GMT

So, recently, I won a Danish research award. A side effect of winning the award is that a bunch of Sune-material was created around it. For example, an official portrait

![](/assets/20220322_much_ado/9f9335e1-1d30-404f-9d5d-24d23e2ee500.jpeg)

There is also a video of me doing "research-y" stuff.

https://www.youtube.com/watch?v=DN2kl0LiqGc

And there's even a little [written "portrait" of me (in Danish)](https://ufm.dk/forskning-og-innovation/forskningsformidling/eliteforsk/prismodtagere/prismodtagere-2022/sune-lehmann). I can't really put this stuff in my normal photo library (don't want to do that to the rest of the family). So now, I'm doing a little blog-post to have it all collected in a single place.

![](/assets/20220322_much_ado/samlet_alle.jpeg) All the prize winners

![](/assets/20220322_much_ado/sune_award.jpeg) Getting the prize next to the minister and the Crown Princess

![](/assets/20220322_much_ado/sune_interview.jpeg) Interviewed on stage

![](/assets/20220322_much_ado/sune_interview2.jpeg) Hamming it up!

![](/assets/20220322_much_ado/line-1.jpeg) My hallmate from Valkendorf's Kollegium 2001-2004, Line Friis Frederiksen, was the MC on the day. Small world.

Official photos from the day.

The true highlight of the day, however, was when I had to go on stage. My last name is first alphabetically, so I was the first person on stage. Prior to the ceremony we had been carefully briefed about where to stand, that members of the Danish Royal Family weren't shaking hands yet, and much more. But what I hadn't anticipated is that the minister and the Crown Princess *had not* received that briefing. So when I got on stage and walked to my designated spot (where the Crown Princess gave me a warm handshake), there was a moment of confusion.
This confusion was picked up by *[Billedbladet](https://www.billedbladet.dk)*, a Danish gossip magazine which subsequently ran a story entitled (my translation): ***[Confusion on stage: Crown Princess Mary helped get things under control](https://www.billedbladet.dk/kongelige/danmark/forvirring-oppe-paa-scenen-kronprinsesse-mary-hjalp-med-faa-styr-paa-tingene)***. If you click through, there's even a video of the "incident". ![](/assets/20220322_much_ado/zoom.jpg) Confusion on stage! So that's perhaps the real breakthrough: I made it to the tabloids 😅

Oh Twitter...

Sun, 09 Jan 2022 19:42:56 GMT

## Experiencing a Twitter take-down

A few days ago, I was scrolling on Twitter when I came across a Twitter thread by Dr. Matthew Sweet (@drmatthewsweet) about Johann Hari's new book *[Stolen Focus](https://stolenfocusbook.com)*. I haven't read Stolen Focus, only the parts about our work. Those parts were completely fine, see full backstory below. I don't know if it counts as "viral", but the thread has 1.9K retweets and almost 5K likes at the time of writing.

https://twitter.com/DrMatthewSweet/status/1479125910896975877

As I read the thread, I have to admit that I grew somewhat uneasy. Disturbingly (from my perspective), the thread uses ***our paper*** (with Philipp Lorenz-Spreen, Philipp Hövel, and Bjarke Mønsted) [Accelerating Dynamics of Collective Attention](https://www.nature.com/articles/s41467-019-09311-w) as a main example of a problem with Hari's book. Here's Dr. Sweet getting into it:

![](/assets/20220109_oh_twitter/screenshot-2022-01-09-at-10.37.17.png)

He goes on:

https://twitter.com/DrMatthewSweet/status/1479125940982763524

This is where it got even more troublesome. Sweet doesn't seem to understand the context of the quote and ends up completely misrepresenting the point we're making in the paper. We do say that the "phenomenon lacks a strong empirical foundation". But the point of that statement (which is from the abstract) is to suggest that the research we present in the paper ***provides that very missing foundation.*** (I'm also a little bit offended that Sweet takes a potshot at us by stating that the paper "didn't make a big impact" [1].)

## So what's the criticism of our work?

Ok. The next tweets contain the actual criticism of our paper:

https://twitter.com/DrMatthewSweet/status/1479125944241696769

That's it?! The argument seems to be that if you're a "fellow historian", you won't like our approach. Perhaps, between the lines, there is also a claim that the datasets we study are not representative enough. Importantly, from a substantive perspective, Sweet doesn't engage with what we're trying to do in the paper at all. And there are zero attempts to address the empirical behaviour that we are observing in the paper. Our careful, peer reviewed work documents systematic changes across time in a range of datasets. Somehow that doesn't matter.

## Collective versus Individual

To be fair, in one subsequent tweet Sweet goes beyond criticising our work and points to a problem he has with the book.

![](/assets/20220109_oh_twitter/screenshot-2022-01-09-at-11.31.36.png)

I fully agree that these two phenomena should not be conflated. For the record, we are very careful not to do that in the paper [2]. (And in the excerpt I read from the book, that was not a problem either.)

## The point about Chaucer

But regarding the part of the tweet above which mentions Chaucer's 9-day wonder, it once again is clear that Sweet hasn't understood what we're doing in the paper. The point of our paper is that this "9-day wonder" is a period of time that we show to be shrinking over time. And it looks like it has been shrinking for a long time. So whatever period Chaucer was talking about is much shorter today. Through careful empirical work, our paper documents precisely that evolution.

## So what now?

The big question now is what we, as authors, should do in a situation like this. On one hand, I am proud of our paper. As explained above, even with my most generous reading, I find Sweet's criticism of our work to be loud, full of mistakes, and almost without substance (good old [argumentum ad lapidem](https://en.wikipedia.org/wiki/Appeal_to_the_stone), cf [1]). And since thousands of people have read his criticism on Twitter, I'd like to publicly call bullshit on the whole thing.

On the other hand, my intuition is that it's best just to stay out of Twitter fights [3]. Twitter is not the place for nuanced discourse.

For now, I've chosen the option of writing this blog post.
I think the post is long enough and navel-gazing enough that no-one will read it. (Sorry, for making so much out of a small thing, folks, but it is my blog after all.) It's also an opportunity to add some nuance to the discussion that's just not possible on Twitter. For now, this is my solution to have a defence of our work on the record, while not getting in Twitter fights.

### **Addendum**: Backstory

This whole thing started back in late 2019 when I was contacted by Johann Hari, who wanted to talk about a paper that I'd recently written in collaboration with Philipp Lorenz-Spreen, Philipp Hövel, and Bjarke Mønsted: [Accelerating Dynamics of Collective Attention](https://www.nature.com/articles/s41467-019-09311-w). The paper had come out earlier that year, and we'd all been doing a lot of press for it, so I didn't think much of it. Hari was in Copenhagen, so he stopped by my office and we had a long, nice chat about the paper.

Then, in January 2021, I heard from Hari again. He was hard at work on his book, and sent me the text of the part about our work (and the conversation we had) to read through. His email specifically noted "***It's very important to me to make sure every aspect of the book is entirely factually accurate so I wanted to send it to you so you could read it through***."

I read the excerpt, and I found it to be a careful description of our work, with more detail than many other journalists would care to add. There were also a few sections of me speculating about reasons and mechanisms, but I found those sections to be clearly labeled as such. So I sent it back with a couple of small technical notes.

Full disclosure: I was also a little bit flattered because Hari had turned me into a kind of living character in the book - a kind of hero, which was new and exciting.

### **Addendum**: I'm not alone in thinking that @drmatthewsweet is being disingenuous After writing the above, I came across a [Twitter thread](https://twitter.com/erikphoel/status/1479483074182324227) by [Erik Hoel](https://www.erikphoel.com) making very similar points. I've unrolled it below for the reader's convenience *Originally tweeted by Erik Hoel ([@erikphoel](https://twitter.com/erikphoel)) on [January 7, 2022](https://twitter.com/erikphoel/status/1479483074182324227).* 1/ How to grind an axe with a pop-sci book. You might have seen the viral thread of [@DrMatthewSweet](https://twitter.com/DrMatthewSweet) implying that the author [@johannhari101](https://twitter.com/johannhari101) cannot "be trusted to handle data." https://twitter.com/DrMatthewSweet/status/1479125910896975877 2/ This is a great way to get some attention for yourself: pick a book, then find a way to say the author is misusing sources (one of the gravest authorial sins) when really you simply disagree with the original sources (a commonplace, boring objection that gets you no likes) 3/ [@DrMatthewSweet](https://twitter.com/DrMatthewSweet) does precisely this switcheroo in his recent viral thread. Let's first look at what he says is the most "serious claim," the strongest evidence of [@johannhari101](https://twitter.com/johannhari101)'s inability to "handle data" (the second objection) https://twitter.com/DrMatthewSweet/status/1479125939208531969 4/ [@DrMatthewSweet](https://twitter.com/DrMatthewSweet) implies the authors of the paper [@johannhari101](https://twitter.com/johannhari101) cites \*concede\* themselves that the hypothesis of shrinking cultural attention spans "lacks a strong empirical foundation." The implication is that Hari misuses the citation. ![](/assets/20220109_oh_twitter/FIgn8_NXMAI1ewN.png) 5/ Except the authors don't concede anything like what [@DrMatthewSweet](https://twitter.com/DrMatthewSweet) is saying! 
That's a quote from the Introduction of the paper, when the authors talk about \*previous research\* and say "but so far" - implying their very paper \*is\* the empirical research that's needed. ![](/assets/20220109_oh_twitter/FIgosGYXsAIBTtn.png) 6/ Does [@DrMatthewSweet](https://twitter.com/DrMatthewSweet) quote that? No. So in accusing another author of cherry-picking, he cherry picks, and not just some data, but a wild out-of-context quote that makes it sound like the conclusion of the study. It's not! It's the \*previous\* research. Here's the abstract: ![](/assets/20220109_oh_twitter/FIgpXmQWQAIU0dI.png) 7/ It's obvious that the paper says what [@johannhari101](https://twitter.com/johannhari101) says it says. That, of course, doesn't mean it's correct! It could be wrong. I haven't read [@johannhari101](https://twitter.com/johannhari101)'s book, I don't know him personally, and I'm pretty skeptical of pop-sci books lately: https://erikhoel.substack.com/p/why-do-most-popular-science-books 8/ But this was [@DrMatthewSweet](https://twitter.com/DrMatthewSweet)'s "most serious claim" and it's really just that he doesn't agree with the paper, NOT that [@johannhari101](https://twitter.com/johannhari101) misused it. To make make this point, [@DrMatthewSweet](https://twitter.com/DrMatthewSweet) took a pull quote totally out-of-context. The Discussion is clear https://www.nature.com/articles/s41467-019-09311-w ![](/assets/20220109_oh_twitter/FIgtGB2XIAoSgAZ.png) 9/ The other claims are much smaller, like that [@johannhari101](https://twitter.com/johannhari101) cited a non-peer-reviewed study to make a point, and that he doesn't know that Facebook let's you find nearby friends. Frankly, they seem like smaller sins than [@DrMatthewSweet](https://twitter.com/DrMatthewSweet)'s misrepresentation of that pull quote. 
10/ So it looks like rather than [@johannhari101](https://twitter.com/johannhari101) misusing sources, [@DrMatthewSweet](https://twitter.com/DrMatthewSweet) directly does so in his thread. Overall, it's just a substitution of a boring disagreement ("I don't like this book & the research behind it") with the exciting implication of authorial misconduct.

## Notes

[1] For fun, I clicked on [the paper's "metrics" page](https://www.nature.com/articles/s41467-019-09311-w/metrics) - which provides a measure of online impact. Here's what it states:

![](/assets/20220109_oh_twitter/screenshot-2022-01-07-at-16.13.11.png)

I don't know what you have to do to "make a big impact" in Sweet's world, but I am personally of the opinion that being in the 99th percentile of *all tracked articles* of a similar age *in all journals* is pretty decent. And while we're nitpicking and speculating wildly: half the author team is from Germany, so it's also factually incorrect to state that the researchers are "in Denmark". Probably it's my paranoia here, but how is it relevant where we're from!? Am I detecting a hint that research from Denmark should be taken less seriously?

[2] As detailed in the "Backstory" addendum, I got to read the part of the book about our work, and there Hari is careful to talk about collective attention and not individual attention.

[3] Also, it turns out that Hari is a [controversial figure in the UK](https://www.theguardian.com/media/2011/sep/14/johann-hari-apologises-orwell-prize). While my personal interactions with him have been great, a Twitter search reveals that a lot of people have an axe to grind with him.
So I also worry that if I jump in to defend our work that somehow it will be confused with a defence of the book as a whole (which I still haven't read in full and thus don't have an opinion about), and which might direct the intense vitriol of the Hari-haters my way ... something that I am also not really interested in.

My favorite talk by me

Thu, 07 Oct 2021 13:41:20 GMT

On May 20th, 2014, I gave a talk at [Christiania](https://en.wikipedia.org/wiki/Freetown_Christiania) in Copenhagen. The talk was part of a cool talk series called "Science and Cocktails", and it had the title "[Complex Networks: Connections, Measurements, and Social Systems](https://www.scienceandcocktails.org/en/events/copenhagen/complex-networks-connections-measurements-and-social-systems)".

To this day I still have fond memories of that talk. It was a packed room, in a stylish old derelict cinema, where - incidentally - the video for [this pop hit](https://www.youtube.com/watch?v=LHCob76kigA) (which I kind of dig, btw) was shot. And that night people actually liked my jokes. It's not so often you get laugh-out-loud reactions to a science talk. This one was my favorite experience of giving a talk ever.

The talk itself is entirely non-technical, full of silliness, and contains a kind of history of Physics-y Network Science, plus in-the-thick-of-it results from our (then ongoing) project [Sensible DTU](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095978).

Now, a few days ago in early October 2021 [1], for reasons unknown to me, someone put the video of that night on YouTube. I haven't watched it (I worry a little bit that it won't live up to my recollection of it), but I'm posting it here so I have a way of finding it again.

https://www.youtube.com/watch?v=H6So2XY5dGg

## Notes

[1] Interestingly, it's been ***7 years*** since I gave the talk. And the pop video I linked in the post was for the song "***7 Years***" by Lukas Graham. Coincidence? Or proof that the Illuminati are running everything? We will never know.

Talk: Leo Anthony Celi

Fri, 13 Aug 2021 10:30:03 GMT

Hey Copenhageners, I'm happy to announce the first talk in a long, long while. We're extremely lucky to have Dr. Leo Anthony Celi from MIT kick things off with a talk about his work with *Machine Learning for health care*. Talk details are below - and if you keep reading, there's also more info about Leo's incredibly impressive resume and ongoing work (read more **here**).

- **Time**: Wednesday August 25th, 13:30.
- **Location**: Technical University of Denmark, Building 321, 1st floor lab-space
- **Title**: Ensuring machine learning for healthcare works for all
- **Abstract**: The gaps in the medical knowledge system stem from the systematic exclusion of the majority of the world’s population from health research. These gaps combined with implicit and explicit biases lead to suboptimal medical decision making which negatively impact health outcomes for everyone, but especially those in groups typically under-represented in health research. Recent developments in machine learning and AI technologies hold some promise to address the issues with the generation of scientific evidence and human decision making. They also, however, have spurred concerns about their potential to maintain if not exacerbate these problems. These concerns must be aggressively addressed by adopting necessary structural reforms to ensure that the field is both equitable and ethical by design.

**Affiliations**: Institute for Medical Engineering and Science, Massachusetts Institute of Technology; Department of Medicine, Beth Israel Deaconess Medical Center; Department of Biostatistics, Harvard T.H. Chan School of Public Health

**Short bio**: As clinical research director and principal research scientist at the MIT Laboratory for Computational Physiology (LCP), and as a practicing intensive care unit (ICU) physician at the Beth Israel Deaconess Medical Center (BIDMC), Leo brings together clinicians and data scientists to support research using data routinely collected in the process of care. His group built and maintains the publicly-available Medical Information Mart for Intensive Care (MIMIC) database and the Philips-MIT eICU Collaborative Research Database, with more than 20,000 users from around the world. In addition, Leo is one of the course directors for HST.936 – global health informatics to improve quality of care, and HST.953 – collaborative data science in medicine, both at MIT. He is an editor of the textbook for each course, both released under an open access license. "Secondary Analysis of Electronic Health Records" has been downloaded more than a million times, and has been translated to Mandarin, Spanish and Korean. He is the inaugural editor of PLOS Digital Health.

You can read more about Leo's amazing work below:

https://imes.mit.edu/research-staff-prof/leo-anthony-celi/

"Too Lazy": Episode 4 with Leidy Klotz

Tue, 27 Apr 2021 19:30:42 GMT

Our Episode 4 guest, Leidy Klotz, is a Professor at the University of Virginia. He studies the science of design: how we transform things from how they are - to how we want them to be. Leidy wants to apply his work outside of academia. He wants to address climate change and systemic inequality, and he also works directly with organizations including the World Bank. Leidy has written more than 80 articles and two books.

Today we talk about his new paper (with a group of excellent co-authors) called "People systematically overlook subtractive changes" (1), which I love, and which recently made the cover of *Nature*. Below you can check out the YouTube version.

https://youtu.be/OzxVGBXuZd4

We also talk about his new book, called “Subtract. The Untapped Science of Less”, which unfolds some of his ideas on a much larger canvas (2).

Leidy is a highly interesting person: before becoming a professor - as we also discuss on the Pod - Leidy managed the design and construction of large engineering projects, and before that he played professional soccer. I had an amazing time talking about this fantastic work!

You can skip around using the timestamps below.

- [0:00:00] Intro to today's episode
- [0:01:42] Podcasts and simplicity
- [0:04:00] How does one go from pro soccer to academia, teamwork, and more.
- [0:15:50] The story of noticing \*subtraction\*
- [0:21:22] From idea to scientific paper
- [0:29:30] Can you always subtract?
- [0:39:30] Subtraction & resources: life, history, and evolution.
- [0:48:39] Learning to subtract in a world of plenty
- [0:55:40] What about "free" addition: hard drives, storage lockers, supporting materials
- [0:59:30] What's next?

#### Credits

The podcast has theme music by Waylon Thornton. Songs are "American Heart" and "Seven". Via freemusicarchive.org and licenced under CC BY-NC-SA. The podcast was funded in part by the Villum Foundation. Intro waveforms by the VSound App.

#### References

(1) https://www.nature.com/articles/s41586-021-03380-y

(2) Find it here: https://www.leidyklotz.com

"Too Lazy": Episode 3 with Dirk Brockmann

Mon, 19 Apr 2021 18:34:55 GMT

This episode’s guest is Dirk Brockmann. Dirk is a physicist and complex systems researcher. He’s a professor at the Department of Biology, Humboldt University of Berlin and the Robert Koch Institute, Berlin. Before returning to his native Germany, he was a professor at Northwestern University.

https://www.youtube.com/watch?v=Xn8TWn5GTGg

Watch it on YouTube, or - even better - subscribe where you normally get your podcasts ([Apple Podcasts](https://podcasts.apple.com/dk/podcast/too-lazy-to-read-the-paper/id1562667368), Spotify, Google Podcasts, etc).

Dirk is a man of many talents. His academic work includes pioneering papers on human mobility and on its connection to infectious disease dynamics (there’s a super nice paper in *Science* from a few years ago developing the idea of effective distance, but also many more). But that’s not all. He has many other papers; to give you a sense, a recent one is on social networks of honey bees. Read all about him and his amazing group of researchers here: https://rocs.hu-berlin.de

Finally, Dirk has been a crucial voice of reason during the COVID-19 pandemic … dare I say becoming a bit of a celebrity in his native lands … all the while also making important scientific contributions on a number of aspects related to the epidemic.

Now. Dirk is also a rebel, so he decided to rebel against the “rules” of the podcast and submitted a paper to me that wasn’t even his own work. But experience tells me that unexpected paths are often the best ones, so I decided to roll with it. And it sure was worth it.

In the podcast, we talk about a review paper by Ilana Zilber-Rosenberg and Eugene Rosenberg concerning the “hologenome theory of evolution” (1). And it was glorious! Relentlessly and methodically, Dirk took me into an amazing world of little creatures I knew little about. And I also detected a beautiful political undercurrent in this story of life and evolution as a massive collaborative and mutually supporting endeavor. Check out his hand-drawn illustrations here: https://twitter.com/suneman/status/1383359122700001283

If you love deep dives, this podcast is for you.

- [0:00:00] Intro by Sune
- [0:04:08] We talk about headphone settings and friendship.
- [0:08:50] Why is Dirk a scientist?
- [0:12:40] Patterns in Biology.
- [0:16:15] Origin story. An anti-arrogance view of the world.
- [0:25:10] We get started talking about the paper. The hologenome.
- [0:33:33] An evolutionary theory that goes beyond the individual; beyond the concept of the species. It's about collaboration.
- [0:43:00] Meddling with Nature.
- [0:55:22] What about variability?
- [0:59:30] The collaborating Aphid.
- [1:03:10] The "Ship of Theseus" and adaptability
- [1:15:16] The invisible Squid; too complex to be intelligent design.
- [1:19:20] One more thing.

#### Notes

(1) https://academic.oup.com/femsre/article/32/5/723/2398937

The podcast has theme music by Waylon Thornton. Songs are "American Heart" and "Seven". Via freemusicarchive.org and licenced under CC BY-NC-SA. The podcast was funded in part by the Villum Foundation.

"Too Lazy": Episode 2 with Roberta Sinatra

Thu, 15 Apr 2021 14:18:53 GMT

Today is Roberta Sinatra day on [#TooLazyPod](https://twitter.com/hashtag/TooLazyPod?src=hashtag_click)!! Roberta is a physicist, an expert on the science of success, and an all-round fantastic person. In the podcast, we talk about her recent paper ["Success and luck in creative careers"](https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-020-00227-w).

https://www.youtube.com/watch?v=NqM2ayOqzqM

Episode 2 is out! In the conversation, we talk about a range of things and get deep into the process of creating science, the "backstage" part of things. If you want an overview of topics, the podcast now has time-stamps.

- [0:00:00] Sune shamelessly fishing for compliments about the podcast trailer.
- [0:02:19] We talk about Roberta's past work, ending up in Boston and working on success.
- [0:10:29] Why Roberta is a scientist, the experience of working in BarabasiLab.
- [0:20:20] How did the question of \*Luck\* enter into Roberta's research?
- [0:32:08] The difficulty of finding the right question. [Being in "the cloud"](https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-020-00227-w).
- [0:37:00] A solution begins to present itself. The Q-model.
- [0:44:53] Decomposing success into skill and luck.
- [0:48:47] Data sources, proxies (+ power-laws vs log-normals).
- [0:55:45] Sune's mind is blown learning about multiplicative luck (3).
- [1:03:30] Getting to the results of the paper!
- [1:14:07] How is talent inherited?
- [1:19:54] Wrapping up!

Too Lazy to Read the Paper. Episode 1

Tue, 13 Apr 2021 18:20:55 GMT

The first episode of my podcast *Too Lazy to Read the Paper* is out now!

https://youtu.be/UmWCe74QfRQ

It's finally out! This inaugural episode features physicist and urban planning, human mobility, and transportation scientist Marta C. González from UC Berkeley explaining the long and winding road to her paper *The TimeGeo modeling framework for urban mobility without travel surveys* [1].

In the podcast, we take our time, tracing Marta's career from Venezuelan graduate student, to postdoc in Germany, Notre Dame (US), and Boston. We hear a bit about what it's like to be a physicist at MIT's transportation department ... and how all those things shaped Marta's research and the paper we're discussing.

If you're checking out this podcast to get an easily digestible version of your favorite paper, you're out of luck. The goal here is more like the director's commentary of a movie ... rich with detail about the person behind the paper. And it seems like the plan worked; check out these replies :)

https://twitter.com/jlmateos/status/1381783945784610817

https://twitter.com/fede7j/status/1381680328641810434

You can also get it as an audio-only podcast here [https://toolazy.buzzsprout.com](https://t.co/zJ3G0XKUaA?amp=1) (I'm working on getting it on iTunes and all that stuff, but it's not quite there yet.)

#### References

[1] https://www.pnas.org/content/113/37/E5370

Big data vs the right data: Thoughts on a recently completed trilogy

Sun, 04 Apr 2021 16:36:41 GMT

Along with a superb group of coauthors ([Andreas Bjerre-Nielsen](https://abjer.github.io/), [Valentin Kassarnig](https://www.linkedin.com/in/valentin-kassarnig-5330a3124/), and [David Dreyer Lassen](https://daviddlassen.github.io/)), I recently published *[Task-specific information outperforms surveillance-style big data in predictive analytics](https://www.pnas.org/content/118/14/e2020258118)* in PNAS.

I am very excited about this paper, which is the conclusion to our trilogy of "learning analytics" papers, based on the [Copenhagen Networks Study (CNS) dataset](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095978). And in my own humble opinion, it's a fine series conclusion [1]. What we found was deeply surprising to us, and while focused on predicting academic performance, the paper makes a couple of important general points, which I discuss below. Plus it's short!

The first two papers in the trilogy are:

- **Part 1**: [Class attendance, peer similarity, and academic performance in a large field study](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0187078) (PLOS One, 2017). In this first part of the trilogy, we develop methods for measuring class attendance using behavioral data and quantify how attending class impacts our predictions of academic performance.
- **Part 2**: [Academic performance and behavioral patterns](https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-018-0138-8) (EPJ Data Science, 2018). In the follow-up, we use class attendance - and all the other rich data we collected about students during the CNS project (face-to-face contacts, online social networks, telecommunication, mobility, demographics, personality, etc) - to predict academic performance. One of the key results from that paper was that, based on these rich data, we could predict performance on the level of state-of-the-art within learning analytics (=quite well).

![](/assets/20210404_big_data_vs_the_right_data_thoughts_on_a_recently_competed_trilogy/boot_loading.jpg)

**Figure 1.** Flashing custom software on the smartphones used for the CNS study back in 2013. We did that for 1000 phones!

For the conclusion of the trilogy we wanted to do something special. So we moved the features from the CNS study to [Statistics Denmark](https://www.dst.dk/en) (DST) [3] in order to see ***how much more accurately we could predict outcomes if we merged the highly detailed CNS data with the comprehensive registry data from DST***.

We expected the combination of behavioral data and registry data to yield even better results than the behavioral data alone. And that is indeed what we found. But what completely took us by surprise is that ***the behavioral data basically didn't add anything to the prediction***. You can see this in Figure 2 (Fig 1A in the paper) below.

![](/assets/20210404_big_data_vs_the_right_data_thoughts_on_a_recently_competed_trilogy/screen-shot-2021-04-04-at-14.51.45.png)

**Figure 2:** Predictive performance of the various data-types considered individually and together. The *x*-axis shows the balanced accuracy, with the vertical dashed line indicating the baseline of random guessing. "Big data" is all the detailed behavioral data we had collected from the CNS study and written about in Parts 1 and 2 of the trilogy. "Administrative data" is the registry data which contains information about past grades as well as sociodemographic information about parents. The top line of the figure (red violin) shows the performance of behavioral data alone, the second line (blue violin) shows administrative data alone, and the bottom line shows performance when both data-types are combined.

As is clear from the plot: When comparing behavioral data with the registry data head-to-head, we clearly get the best prediction from the administrative data alone.
And when we combine the two, the behavioral data does not add any additional information! (And at this stage I remind the reader that in Part 2 of the trilogy, we showed that the behavioral data is absolutely on par with state-of-the-art in terms of predicting actual performance).

## What's going on!?

In the paper itself, we dig deeper and show that the thing that really matters in terms of making accurate predictions is how well students did academically before starting university. And that finding helps us make sense of the (surprising) result above. What matters for accurate prediction is access to a measure of ***task-specific performance***. If you want to know how well a student is going to fare at an exam, then information about their performance at a previous exam will be highly valuable. As we show in the paper, that information is much more relevant than information about your social network, friendship circle, personality, and so on.

The analogy that I like to use is the 100*m* sprint. Let's say you find a person and want to predict how long it will take them to run 100 meters. As I see it, the [surveillance capitalism/](https://www.theguardian.com/books/2019/oct/04/shoshana-zuboff-surveillance-capitalism-assault-human-automomy-digital-privacy)data exhaust/digital breadcrumbs approach is similar to sending a person to the gym and measuring how much weight they can lift in bench press, squat, dead lift, and so on. Perhaps also measure their BMI, body fat percentage, VO2max, and take some blood panels. This kind of rich information tells us a lot about a person's general fitness. And based on this type of data, we can make many very different (but mediocre) predictions. We could probably do a reasonable job predicting their lifespan, guessing their age, estimating how many pull-ups they could do, what kind of diet they eat, etc. Based on these data, we could probably also do an acceptable job at predicting how fast that person can run 100*m*.
But my bet is that ***information about their most recent time running 100m would be a much more valuable piece of information***. That's what we mean by "task-specific information" in the paper. And I think that's why we find that the most useful information in the case of academic performance ... is past academic performance!

## What does this mean for data science?

The first implication for data science (which we also mention in the paper) is that our finding provides yet another argument that, for every practical application, we should carefully consider whether we really need to collect all that data in the first place. In the case of learning analytics specifically, we make the point that it might not be necessary to install an app on every single student's phone and monitor their every move in order to identify the students in need of extra help or support. One could simply take a look at high-school transcripts (submitted during the application procedure) - or simply ask the students!

The more general version of this is expressed beautifully in the paper (formulated by co-author David Dreyer Lassen), so I'll just quote it here:

*The privacy–utility tradeoff ( ... ) posits that predictive ability from personal data is inversely related to privacy preservation. While generally true within a given dataset, this approach neglects the possibility that other data, possibly from different sources, on the same set of individuals may have a superior predictive ability for a given, or even more favorable, level of privacy. We argue—following the logic of prediction contests, where **new candidate models are compared against the best possible alternative rather than a benchmark of zero predictive ability**—that we should compare the predictive ability of different datasets, with different levels of granularity and potential privacy implications, to make more-informed choices about prediction/privacy tradeoffs.
This insight is particularly important for characteristics or behaviors that are more stable over time and for outcomes where past task-specific information is available.* [my emphasis]

The second implication is a bit more fuzzy. A sharp formulation of the exact issue is still not really clear in my mind. But somehow this feels to me like a new kind of frontier among the many complex issues related to Big Data (e.g. among privacy, biases, transparency, security, and so on). The best way I can express it is that the issue is related to something we could call *"relevantness"*.

Big datasets can be used to predict many different things about the people in those datasets. And it is true that sometimes we can wring surprising insights out of behavioral data. Famously, for example, we know that it is possible to estimate political leanings, personality and more, based on [Facebook likes](https://www.pnas.org/content/110/15/5802). But what we argue in our new paper is that there's also a limit to these big datasets. It's not magic, and there are things we cannot learn just because we have access to large scale behavioral data. We can't necessarily estimate your risk of heart disease based on reading habits, and so on.

Here, I think coming back to the analogy of physiological measurements is useful. The big behavioral datasets allow us to provide mediocre answers to many different questions, but for each specific question we have, there are better ways of getting accurate answers. And that's the new trade-off we also need to think about: Is the information we're interested in even available within the surveillance-style datasets we're collecting? Stated borrowing some terms from linear algebra: Is it even possible to express the direction we're interested in based on the basis vectors spanned by our surveillance data set? Or something like that.

That last part got a bit rambling, but if you're still here, let me know if this makes sense to you.
I'll be thinking more about this in the future and hopefully my thoughts on this last aspect will get clearer with time :)

## Notes

[1] I honestly think that it's "Avengers Endgame"-level. By this I don't mean that the paper is as good as *Avengers Endgame*. I just mean that *Avengers Endgame* was a satisfying end to the Avengers quadrilogy [2] - and in the same way this paper is a satisfying end to our paper trilogy.

[2] And yes, I am using the word "quadrilogy" to describe a 4 part series even though I think it is a ridiculous word. A better term for a four part series is "quartet" (e.g. as in Lawrence Durrell's *Alexandria Quartet*), but that's clearly too pretentious for the Avengers Series. And I'm not a fan of the term tetralogy.

[3] Statistics Denmark is the central authority on Danish statistics. Their mission is to collect, compile and publish statistics on the Danish society - and so they have an amazing dataset on all Danes, including lifelong academic performance, income, health, you name it.
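The linear-algebra phrasing in the post above can be made a little more concrete. Below is a minimal sketch, not anything from the paper: it treats each measured feature as a vector, builds an orthonormal basis for the space those features span (plain Gram-Schmidt), and asks how much of a "target direction" lives inside that space. The function names and the toy vectors are assumptions made up for illustration.

```python
import math


def dot(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors that add no new direction
            basis.append([wi / norm for wi in w])
    return basis


def fraction_in_span(target, vectors):
    """Share of the target's squared length that lies inside span(vectors)."""
    basis = orthonormal_basis(vectors)
    captured = sum(dot(target, b) ** 2 for b in basis)
    return captured / dot(target, target)


# Hypothetical toy data: two "surveillance" features living in a 3-d space.
features = [(1, 0, 0), (1, 1, 0)]
print(fraction_in_span((2, 3, 0), features))  # target fully inside the span
print(fraction_in_span((0, 0, 1), features))  # target orthogonal to the span
print(fraction_in_span((1, 0, 1), features))  # target only partly expressible
```

A value near 1 means the features can express the direction we care about; a value near 0 means no amount of clever modeling on those features will recover it. Real prediction problems are noisier and nonlinear, so this is only a picture of the intuition.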

Podcast Trailer: Too Lazy to Read the Paper

Tue, 23 Mar 2021 10:04:02 GMT

I've made a silly teaser trailer for the first season of my science podcast project "Too Lazy to Read the Paper". The setup is a video call where the author explains a paper to me. We can use screen-sharing, for figures, etc. We'll record the call and post to YouTube. Possible participants are authors of a paper in network science or data science. The intended audience is PhD students, PostDocs and other scientists. And it's of course also fine to talk as a group of co-authors! The idea is to start with a bit about you, the idea for the paper. Then talk about the research itself. And we'll end by gossiping about the reviewing process, etc. (The whole thing is based on the idea that papers are so formal. And that when two people talk to each other informally, it's often more fun - and tends to get ideas across more effectively). https://youtu.be/b9M6H\_FMMyM

Video from talk at University of Exeter

Fri, 29 Jan 2021 18:49:24 GMT

A couple of days ago I gave a talk at the University of Exeter. In it, I discuss the take on higher-order structures in networks that I developed a couple of years ago - and which I'm still excited about. Check it out below:

https://www.youtube.com/watch?v=i2LQRSJ1G7w

Let's build networks of science-friends!

Fri, 06 Nov 2020 12:20:42 GMT

**Conference fatigue**. I'm realizing that conferences are not really working for me at the moment. In spite of heroic efforts from conference organizers, super-star speakers, etc., when experienced through the screen in my spare bedroom, it's all turning into a uniform, gray blur. And I generally don't enjoy giving talks into a matrix of empty, muted windows.

**There is a deeper problem:** But who cares about privileged old Sune? It has been argued - and I strongly agree - that *the key victims of conferences moving online are the young researchers*. The PhD and PostDoc level. By not attending in person, the young researchers are robbed of ways of building networks of "science friends": other young researchers in other labs and countries that they'll need as their careers progress. (To this day, many of my best science-friends are people I met at NetSci conferences during the aughts). And this is also a loss for Science more generally. What will our fields look like as these networks are no longer allowed to grow?

**Restoring network-building for young researchers in the age of COVID-19 lockdowns**. So maybe we should work to set up a structure to run talks/conferences in a way that favors the young researchers. Let's build networks of science-friends!

I've thought about this for a while. And gotten good ideas from the actual young researchers in my group. Here's the general idea:

- For increased engagement, we need to shift the focus to discussions rather than talks. So we need small groups.
- I would still make it about presenting papers. *But we need to do it in a way where people don't just surf the internet during talks*. For that we need small groups. And lots of discussion afterwards.
- It's about letting friendships grow. For that we need small groups. And lots of discussion.
- Conferences are also about "being seen" by senior researchers in the field, so maybe we should put one senior person in each little group.

**The nitty gritty**: To make this concrete, I would

1. Start by collecting a long list of interested young researchers.
2. Get research keywords from everyone.
3. Group people into small groups of 8, with groups based on shared interests, but forming links across labs, institutions, and countries.
4. Groups could meet weekly for 1-2 hours.
5. At each meeting there would be 2 short talks, then discussion.

After 4 weeks, the group would be done. And we could do multiple runs to foster more connections. Build more possibilities for friendships. So everyone would have a crew to have dinner with once the real conferences resume. (How I miss those!).

**Looping in the seniors**: But wait a minute!!! Another important aspect of conferences (from the perspective of young researchers) is to meet famous people from their field. To see those authors of great papers in person. Thus, to supplement, we could **also** gather a list of senior researchers, who would volunteer to sit in on the talks, provide a bit of initial feedback and get the discussion started. The oldies should rotate every week so that the young people would have a chance to hear from the max number of different senior people.

**Improving**: Let me know what you think! If people like this idea, maybe we should get something like this going? I'm especially interested to know:

- Do you have any ideas for improvements of this format?
- If you're a young researcher, does this sound interesting to you?
- If you're a senior researcher, would you be up for joining as a mentor?

(Thanks to the researchers in my group for feedback on an early version of this post)
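For anyone who wants to try the grouping step from the list above, here is a minimal sketch of how it could be done. It is one possible approach, not a worked-out system: people are grouped greedily by keyword overlap (Jaccard similarity), with a small penalty for sharing a lab so that groups reach across institutions, and the seniors rotate round-robin from week to week. The `Researcher` class, the function names, and the `SAME_LAB_PENALTY` value are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical knob: how strongly to discourage putting labmates together.
SAME_LAB_PENALTY = 0.1


@dataclass(frozen=True)
class Researcher:
    name: str
    lab: str
    keywords: frozenset


def jaccard(a, b):
    """Keyword overlap between two researchers, between 0 and 1."""
    union = a.keywords | b.keywords
    return len(a.keywords & b.keywords) / len(union) if union else 0.0


def affinity(candidate, group):
    """Mean keyword overlap with the group, minus a penalty per labmate."""
    overlap = sum(jaccard(candidate, m) for m in group) / len(group)
    same_lab = sum(m.lab == candidate.lab for m in group)
    return overlap - SAME_LAB_PENALTY * same_lab


def make_groups(people, size=8):
    """Seed a group with the next person, then add the best fits one by one."""
    remaining = list(people)
    groups = []
    while remaining:
        group = [remaining.pop(0)]
        while remaining and len(group) < size:
            best = max(remaining, key=lambda p: affinity(p, group))
            remaining.remove(best)
            group.append(best)
        groups.append(group)
    return groups


def senior_schedule(seniors, n_groups, weeks=4):
    """Round-robin: shift which senior visits which group by one each week."""
    return [
        {g: seniors[(g + w) % len(seniors)] for g in range(n_groups)}
        for w in range(weeks)
    ]
```

With at least as many seniors as groups, no senior is double-booked in a given week, and with at least as many seniors as weeks, each group meets a different senior every time. The greedy grouping is deliberately simple; a fancier version could balance countries or career stage as well.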

Contact Tracing

Sun, 26 Apr 2020 11:56:33 GMT

**Update May 23rd**. Final update for this post. I'm happy to report that the Danish government - in part based on input from our advisory board - has decided to base the Danish contact tracing app on the DP-3T framework (as implemented by Google and Apple). Details here: https://www.sum.dk/Aktuelt/Nyheder/Coronavirus/2020/Maj/Politisk-aftale-om-frivillig-smittesporingsapp-for-covid-19.aspx

**Update May 1st**. Here's another update. It also turns out the advisory board wasn't officially announced until today. I hope the unintended embargo-breaking below won't have any serious consequences. Here's the official press release: see also [here](http://sum.dk/Aktuelt/Nyheder/Coronavirus/2020/Maj/Smittestop-app.aspx).

**Update: April 27th, 2020.** *Well, it turns out I was wrong about Reason 1 below. And that I have lousy timing in writing blogposts with declarations in them.*

*Pretty much right after finishing the post below (original post was written on April 26th), I was invited to be a member of a newly established Advisory Board for the official Danish Contact Tracing App. The goal of the advisory board is to provide advice/feedback for the Agency for Digitization and the Ministry of Health regarding the App.*

*Since I care a lot about doing digital contact tracing in the right way (with respect to privacy & epidemiological relevance), I accepted the offer to join. Clearly that changes some aspects of what I've written below.*

Over the past few weeks, I've commented on contact tracing in the Danish press. But starting today, I am going to stop chiming in on this topic. Reason 1 is that I don't feel like I'm making much of an impact in terms of where the official app is going (which was my main reason to start speaking about this in the first place). Reason 2 is that the news on this topic is moving fast and staying updated is exhausting. Reason 3 is that engagement in this topic is drawing me away from the deep & long-term work that I want to be focusing on.
And finally, Reason 4 is that contact tracing connects deeply to projects I did years ago, so it feels a bit removed from my current core scientific work of network science and mobility modeling. Thus, I'll be doing my best to pipe down from now on. If you want updated info about contact tracing in Denmark, I recommend checking out [Henrik Moltke's twitter feed](https://twitter.com/moltke). For the overall (epidemiology and privacy) view on contact tracing, I personally follow [Marcel Salathe at EPFL](https://twitter.com/marcelsalathe). Here are my two key points on contact tracing: - A decentralized protocol (such as DP-3T) is the best way to do things. - Contact tracing is not necessarily the best way to counteract spread of COVID-19. (For example: Re-focusing all the resources we're currently using on building apps towards getting everyone to wear masks, might very well be a better strategy.) Below, I provide some context on each of these points, referring back to what I've already said in the press. ## Privacy preserving contact tracing In the beginning of the epidemic, I supported (and made a small contribution to) the PEPP-PT initiative, which was then more of an umbrella organization, also including DP-3T. See [this press release](https://www.dtu.dk/english/news/2020/04/ny-app-skal-afdaekke-smittekaeder?id=4eaab1c6-ccc5-4c29-ac99-22e126d4555c) from DTU. At some point there was an internal argument (which I know nothing about), resulting in a split into a centralized (still privacy preserving) paradigm embodied by PEPP-PT and a decentralized paradigm represented by DP-3T and also incorporated in the Apple/Google solution to contact tracing. I have provided continuous updates on my attitude towards this issue via Twitter. See below ![](/assets/20200426_contact_tracing/screenshot-2020-04-26-11.24.21.png) I'm placing an image instead of embedding the tweets because I can't figure out how to display the thread. 
Link here ![](/assets/20200426_contact_tracing/screenshot-2020-04-26-11.17.33.png) Again, I'm putting an image here instead of an embed - since I can't figure out how to embed the thread. But you can find it here: Here's a [link to a nice report on issues with the centralized solution used in PEPP-PT](https://github.com/DP-3T/documents/blob/master/Security%20analysis/PEPP-PT_%20Data%20Protection%20Architechture%20-%20Security%20and%20privacy%20analysis.pdf). This report, written by members of the DP-3T project, came out shortly after PEPP-PT finally provided open source code. ## Is contact tracing using cellphones even a good idea? An important point to make when it comes to contact tracing is that it might not be a very good solution. Many of the issues are discussed in [this Twitter thread](https://twitter.com/vedransekara/status/1251160367801094144) by UNICEF Machine Learning Lead & Principal Researcher Vedran Sekara. I've also talked about these issues in the Danish Press. - *Version2* (April 20th, 2020): [DTU-professor advarer: Lad være at forvente for meget af corona-apps](https://www.version2.dk/artikel/dtu-professor-advarer-lad-vaere-at-forvente-meget-corona-apps-1090441). This article has details on potential technical issues related to how we convert info on Bluetooth scans to an understanding of epidemiological interactions. - *Information* (April 18th, 2020): [Apple og Google lancerer værktøj, som vil kunne spore spredning af COVID-19. Eksperter er ikke specielt bekymrede for vores privatliv](https://www.information.dk/indland/2020/04/apple-google-lancerer-vaerktoej-spore-spredning-covid-19-eksperter-specielt-bekymrede-vores-privatliv). Here, I discuss the Google/Apple solution and how we leak data in many different contexts. - *Weekendavisen* (April 3rd, 2020): [Fagre nye færden](https://www.weekendavisen.dk/2020-14/samfund/fagre-nye-faerden). Talks about surveillance more generally.

Achievement Unlocked!

Wed, 13 Mar 2019 09:13:05 GMT

Starting March 1st this year, I was promoted to full professor at DTU! Pretty exciting. And if you want proof, take a look at my [updated profile page](https://www.dtu.dk/english/service/phonebook/person?id=25317&tab=1) (conveniently screen-shot and marked up below). ![](/assets/20190313_achievement_unlocked/screenshot-2019-03-09-15.10.24.png) Hmm. And I should figure out how to get a new photo - this one is from 2009.

Arek Stopczynski Visit

Sun, 30 Dec 2018 13:33:44 GMT

We're lucky to have Lab Alum Arkadiusz 'Arek' Stopczynski visiting the lab on January 9th and 10th. On the 9th, he's busy being an examiner at a PhD Defense, but on his second day in Denmark, he's going to give a talk to tell us about what he's been up to since starting at Google a couple of years ago. In addition to working with us at DTU, Arek has also been a postdoc in Sandy Pentland's lab at MIT's MediaLab, he was an integral part of building the world's first mobile brain scanner, and he's [given a great TEDx talk](https://tedxbeaconstreet.com/speakers/arkadiusz-stopczynski/). And did I mention that [we've just put out yet another paper together](https://www.nature.com/articles/s41598-018-36116-6)? Below, you can find the talk details: - **Date**: January 10th, 2019 - **Time**: 13:30-14:30 - **Place**: DTU Building 321, in the first floor lab space - **Title**: Data Science: Thinking Industry - **Abstract**: The practice of Data Science involves employing different methodologies, techniques, and tools, both deconstructive and constructive. In this talk we will discuss some fundamental differences in how Academia and Industry (exemplified by large tech companies) approach teaching and applying Data Science. These differences have important implications for how we teach students and conduct research. I hope you can make it. Arek will stick around afterwards if you'd like to chat and hang out.

Complex Networks in Cambridge

Sun, 16 Dec 2018 10:43:14 GMT

I had an absolutely wonderful time at the [Complex Networks 2018](https://www.complexnetworks.org) conference last week in Cambridge, UK. I learned a lot, got caught up a bit with all the amazing work that's going on within complex network analysis, and got to see some of the great new young researchers in the field. At the community detection sessions, I also saw several talks that drew on [our work on Link Clustering](https://www.nature.com/articles/nature09182), expanding and building on those ideas. Now don't get me wrong: That work is well cited, so I know people have been reading it. But my sense is that most of the citations are of the type "This is also something one could do" or from people applying the algorithm. Those are both great (and a much better fate than what befalls most of my papers), but it is still extra exciting to see people adopting, refining, and developing the ideas - using them for their own work with community detection methods! https://twitter.com/suneman/status/1073184265528139777 Another exciting development was to see how lots of people are starting to apply machine learning (including embeddings, etc) to networks. Finally, I also got to give my own keynote about our recent paper on the [Chaperone Effect in Scientific Publishing](https://www.pnas.org/content/115/50/12603). 
It was a brand new talk (since the paper just came out 2 days prior), but judging from the Twitter reaction, people liked it :) https://twitter.com/lajello/status/1073263174063976449 https://twitter.com/LeonieMueck/status/1073265309568679942 https://twitter.com/funwithnetworks/status/1073261360086175744 https://twitter.com/streetAcademic/status/1073272974512402432 https://twitter.com/ronaldomenezes/status/1073256469062135808 https://twitter.com/franzbenjamin/status/1073260111257313283 https://twitter.com/juanrocha/status/1073256000591941634 https://twitter.com/TahaYasseri/status/1073254854078935040 https://twitter.com/ewancolman/status/1073262809725722627

TEDx Aarhus

Fri, 19 Oct 2018 08:26:48 GMT

In the beginning of October, I headed to Aarhus to participate in TEDx Aarhus. It was a great learning experience *and* excellent fun to participate in that event. The Aarhus TEDx crew were just super pro at everything. That includes excellent, pro-level photos - and they [posted all those great photos to Flickr](https://www.flickr.com/photos/146374044@N06/sets/72157672352152317/with/45334568621/). Below, I'm including the ones related to my talk. ![](/assets/20181019_tedx_aarhus/004_audience2.jpg) ![](/assets/20181019_tedx_aarhus/003_audience1.jpg) ![](/assets/20181019_tedx_aarhus/002_setdesign.jpg) ![](/assets/20181019_tedx_aarhus/010_talk.jpg) ![](/assets/20181019_tedx_aarhus/005_talk.jpg) ![](/assets/20181019_tedx_aarhus/007_audience.jpg) ![](/assets/20181019_tedx_aarhus/009_talk.jpg) ![](/assets/20181019_tedx_aarhus/006_talk.jpg) ![](/assets/20181019_tedx_aarhus/008_talk.jpg) ![](/assets/20181019_tedx_aarhus/011_talk.jpg) ![](/assets/20181019_tedx_aarhus/012_hands_up.jpg) ![](/assets/20181019_tedx_aarhus/014_interview.jpg)

Pantelis Pipergias Analytis visit

Thu, 27 Sep 2018 10:26:46 GMT

On October 9th, we are lucky to have Pantelis Pipergias Analytis visiting the group. Pantelis recently started as an assistant professor at the *Danish Institute of Advanced Studies* (D-IAS) at the University of Southern Denmark. Before moving to Denmark, he spent the past two years as a postdoctoral researcher at the Computer and Information Science department at Cornell University. Pantelis got his PhD from the Max Planck Institute for Human Development in Berlin. Pantelis will give a talk based on his recent Nature Human Behavior paper *[Social learning strategies for matters of taste](https://www.nature.com/articles/s41562-018-0343-2)* - **Date**: October 9th - **Time**: 13:30 - **Place**: Technical University of Denmark, Building 321, Room 134 **Title**: *Social learning strategies for matters of taste* **Abstract**: Most choices people make are about ‘matters of taste’, on which there is no universal, objective truth. Nevertheless, people can learn from the experiences of individuals with similar tastes who have already evaluated the available options - a potential harnessed by recommender systems. We mapped recommender system algorithms to models of human judgement and decision-making about ‘matters of fact’ and recast the latter as social learning strategies for matters of taste. Using computer simulations on a large-scale, empirical dataset, we studied how people could leverage the experiences of others to make better decisions. Our simulations showed that experienced individuals can benefit from relying mostly on the opinions of seemingly similar people; by contrast, inexperienced individuals cannot reliably estimate similarity and are better off picking the mainstream option despite differences in taste. 
Crucially, the level of experience beyond which people should switch to similarity-heavy strategies varies substantially across individuals and depends on how mainstream (or alternative) an individual’s tastes are and the level of dispersion in taste similarity with the other people in the group.

Piotr Sapieżyński on Fairness in ranking

Mon, 03 Sep 2018 21:54:16 GMT

Our old friend Piotr, current postdoc at Northeastern and graduate from the group, is visiting from his new home beyond the Atlantic. This coming Thursday, Piotr will give a short talk about his most recent work. Details below. - **Time**: Thursday, Sept 6th, 11AM - **Location**: Technical University of Denmark, B321, lab-space - **Title**: Fairness in ranking **Abstract**: Ranked lists of persons and items are a core part of the user experience in many online services, such as search, social media feeds, hiring, and dating sites. Studies have shown disparate amount of attention received by high rank results, potentially leading to loss of opportunity and access to resources among the lower ranked items. In this short talk I will give an overview of the work on individual and group fairness in ranked lists and focus on our work in progress: a novel metric for investigating group unfairness in ranked lists. Our approach relies on estimating the amount of attention given to members of a protected group and comparing it to that group's representation in a defined population. It offers two major developments compared to the state of the art. First, rather than assuming a logarithmic loss in importance as a function of the rank, we allow for attention distributions that are specific to the audited service and the habits of its users. For example, more items are consumed in a single viewing of a social media feed than as a result of a single query in a web search engine. Second, we allow non-binary protected attributes (gender, race, etc.), both to better reflect the way individuals identify, but also to enable measurements on aggregates of multiple search runs, rather than separately for each result list. We investigate the properties of the metric and compare them to the behavior of other established approaches using synthetic ranked lists. Finally, we showcase the metric through a simulated audit of a number of hiring and dating services.
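To make the core idea in the abstract a bit more concrete, here is a toy sketch of comparing the attention a group receives with its share of a reference population. This is not the metric from the talk (the abstract doesn't spell it out); the function name, the attention weights, the group labels, and the population shares are all made up for illustration.

```python
def attention_share(ranking, attention, group):
    """Fraction of the total attention that lands on items labeled `group`.

    ranking   -- group label of the item at each rank (top first)
    attention -- attention weight assumed for each rank (hypothetical values)
    """
    total = sum(attention[: len(ranking)])
    received = sum(a for label, a in zip(ranking, attention) if label == group)
    return received / total


# Hypothetical ranked list: the group label of the item shown at each rank.
ranking = ["a", "a", "b", "a", "b"]

# Hypothetical service-specific attention per rank (not a fixed logarithmic discount).
attention = [0.5, 0.25, 0.125, 0.0625, 0.0625]

# Hypothetical share of each group in the reference population.
population_share = {"a": 0.5, "b": 0.5}

# Positive gap: the group gets more attention than its population share.
gap = {g: attention_share(ranking, attention, g) - population_share[g]
       for g in population_share}
print(gap)
```

In this toy list, group "b" makes up half of the population but sits at ranks 3 and 5, so it receives less than half of the attention and ends up with a negative gap.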

Bernardo Huberman and AI for the Network

Sat, 04 Aug 2018 21:29:53 GMT

Later this month we will have legendary researcher Bernardo Huberman visiting. And we're lucky enough to have him giving a talk on one of the most exciting new developments in Network Science: Applying AI to networking problems. Bernardo has been a central player throughout the rise of network theory (and mentor for field notables, such as Lada Adamic and Jure Leskovec), but that’s just a fraction of what he’s accomplished. If you care about anything related to information sciences, this is a talk you cannot miss. Bernardo is a Fellow and vice president of the Core Innovation Team at [CableLabs](https://www.cablelabs.com/). He is also a Consulting Professor in the Department of [Applied Physics](https://en.wikipedia.org/wiki/Applied_Physics "Applied Physics") and the Symbolic System Program at [Stanford University](https://en.wikipedia.org/wiki/Stanford_University "Stanford University"). Previously he was Senior Fellow and Senior Vice President at [Hewlett Packard Enterprise Company](http://www.labs.hpe.com/), and Director of the Mechanisms and Design Lab at Hewlett Packard Labs. - **Date**: August 29, 2017. - **Time**: 14:00 - **Location**: Technical University of Denmark, Building 321, 1st floor: Room 134 **Title**: Artificial Intelligence and the Network **Abstract**: Artificial Intelligence is the attempt to make computers emulate human cognition and thought processes. It has existed for a long time and has sprouted a number of subfields, from semantic networks and common sense reasoning to robotics, logic programming and machine learning. In spite of the glacial rate of progress in AI, one subfield, machine learning, has recently taken off like wildfire. What powers this incredible growth is the availability of fast processors that have made possible computations that seemed hard to achieve a few years ago. As a result, we now have powerful systems that can easily recognize myriad images and spoken languages. 
This talk will describe some of the great successes of machine learning, their limitations, and their application to networking problems which pervade modern communications. I will also present a form of artificial intelligence that is distributed in nature and that mimics the ability of groups of people and social insects to solve extremely hard problems.

Max Schich Talk

Mon, 16 Apr 2018 19:26:26 GMT

We're lucky to have Max Schich visiting DTU tomorrow. Max is an associate professor for arts and technology at The University of Texas at Dallas and a founding member of the Edith O’Donnell Institute of Art History. His work converges hermeneutics, information visualization, computer science, and physics to understand art, history, and culture. Schich is the first author of "A Network Framework of Cultural History" (Science magazine, 2014) and a lead co-author of the animation "Charting Culture" (Nature video, 2014). He is an editorial advisor at Leonardo Journal, an editorial board member at Palgrave Communications (NPG), and the Journal for Digital Art History. He publishes in multiple disciplines and speaks to translate his ideas to diverse audiences across academia and industry. His work received global press coverage in 28 languages. Details - **Time**: April 17th, 14:00 - **Location**: DTU, Building 321, first floor lab space - **Title**: Towards a Morphology of Durations **ABSTRACT:** History has no periodic table of elements and no theory of temporal structure, as George Kubler pointed out in 1962, yet, as he also points out, things occupy time in a bounded number of ways. The obvious question still is: Can we capture the shape of time? – Tackling this challenge, this talk looks at historical time systematically, dealing with more or less exponential growth, the archaeological paradox, global and meso-level patterns, cycles, periodicity, condensation, and a bouquet of oddities. Here's a cool video about some of Max's recent work https://www.youtube.com/watch?v=4gIhRkCcD4U

Inaugural lecture

Sun, 08 Apr 2018 13:06:06 GMT

Sometime last year I became an adjunct professor at University of Copenhagen's Department of Sociology. And just to be clear: I'm still primarily at the Technical University of Denmark. The adjunct position is more of a way of signaling that I work closely with social science researchers (e.g. through my associate director position at [SODAS](http://sodas.ku.dk)). Anyway, the important thing here is that **I'm finally giving my inaugural lecture**. The lecture is a fun chance for me to reflect on what's happened up to now. My goal is to make the lecture fun, entertaining, and personal (in a way that I hope will shed light on the mechanics of the scientific process). I hope you'll come and see it. Here are the details: - **Date & Time**: Friday April 20th - **Location**: *Room 35.01.44*, University of Copenhagen. [*It's not super easy to find building 35, so here are special directions: The easiest way is to go to [Gammeltoftgade 15, Copenhagen K](https://goo.gl/maps/kmnEzth5mam) and enter the brand-new building (Building 35), then head to the basement & follow the signs to 35.01.44*] - **[Official link](http://www.soc.ku.dk/instituttet/arrangementer/communities-in-social-networks/).**

Video about Sune

Wed, 24 Jan 2018 15:00:06 GMT

The amazing Villum Foundation (who first funded the SensibleDTU project back in 2012) wanted to showcase some of the work coming out of their Young Investigator Program. As part of that, they have produced a beautiful video about my research (and also a bit about who I am). [vimeo https://vimeo.com/251793047] It's directed by Ole Stenum, who shot something like 5 hours of conversation and elegantly captured the essence of the whole thing in a 3 minute film. It was fun to be part of the whole process.

Benjamin Maier Talk

Mon, 27 Nov 2017 12:42:37 GMT

This Wednesday we will have [Benjamin Maier](http://benmaier.org/about/), a PhD student (physics) from Dirk Brockmann's group speaking at DTU. Ben is based at the Robert Koch-Institut but also affiliated with Humboldt University Berlin's physics department and IRI Life Sciences. His interest lies in identifying the underlying processes of human contact facilitating the spread of diseases. In particular he is investigating the influence of structural properties of human contact networks or human transport networks, both of static and dynamic nature. - **Title**: Flockworks, A class of dynamic network models for face-to-face interactions - **Date**: Wednesday, November 29th - **Time**: 13:30 - **Location**: DTU, Building 321, 1st floor lab space **Abstract**: Studying the dynamics of face-to-face interaction networks is essential for a better understanding of contact mediated processes, contagion processes, and disease spreading. In many studies regarding social systems, networks are reconstructed using time averages or integrated networks, in which links reflect an interaction likelihood, although frequently this measure is not well defined but serves as a qualitative feature from which network properties are computed. During the last years a significant effort was made to resolve this issue by developing algorithms to analyse dynamic processes on the actual time-dependent contact patterns of social systems. However, there is still a lack of simple dynamic network models generating temporal networks of typical behaviour observed in real systems. We introduce a class of minimal dynamic network models that naturally yield group formation and are easy to control. In those models, randomly chosen individual nodes cut their existing links, connect to a target node and establish links to the target's neighbors. We discuss a variety of properties of those models and show how to use it for comparison of epidemic processes on real-world data.
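The rewiring rule described in the abstract (a random node cuts its links, connects to a target, and links to the target's neighbors) is simple enough to sketch in a few lines. This is only an illustration under assumptions, not Ben's implementation: the function name, the dictionary-of-sets graph representation, and the `p_copy` parameter (the probability of linking to each of the target's neighbors) are hypothetical choices made for this example.

```python
import random


def flockwork_step(adj, rng, p_copy=1.0):
    """Apply one rewiring event to an undirected graph {node: set(neighbors)}.

    p_copy is an assumed parameter: the probability of linking to each
    neighbor of the target. With p_copy=1.0 the node joins the whole group.
    """
    nodes = list(adj)
    i = rng.choice(nodes)

    # 1. The chosen node cuts all of its existing links.
    for k in adj[i]:
        adj[k].discard(i)
    adj[i] = set()

    # 2. It connects to a randomly chosen target node (never itself).
    j = rng.choice([v for v in nodes if v != i])

    # 3. It also establishes links to the target's neighbors.
    new_links = {j} | {k for k in adj[j] if rng.random() < p_copy}
    for k in new_links:
        adj[i].add(k)
        adj[k].add(i)
    return adj


# Start from 20 isolated nodes and apply 200 rewiring events.
rng = random.Random(0)
adj = {v: set() for v in range(20)}
for _ in range(200):
    flockwork_step(adj, rng)

print(sorted(len(neighbors) for neighbors in adj.values()))
```

Because a node that joins a target also links to the target's neighbors, repeated events tend to produce densely connected groups, which is the group formation the abstract mentions.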

Sune Lehmann Talk

Fri, 03 Nov 2017 11:56:02 GMT

Next week, I'll be giving a talk about SensibleDTU at DTU. I hope the text below explains everything. - **Title**: *Sensible DTU. Is that project still going on? If yes, I wonder what the h&ck they're working on these days?* - **Date**: Tuesday Nov 7th, 2017. - **Time:** 13:00 - **Location**: DTU, Building 324, Room 030 **Abstract**: We've actually been doing a lot of interesting work on the Sensible DTU dataset over the past year or so. (SensibleDTU is the project where we collected + dynamic multilayer network and behavioral data from 1000 smartphones) . This talk covers highlights and goes in depth with the most exciting projects. And you have a chance to ask questions: Maybe there's something in there for you to test your own algorithm on. I'm also considering delivering some deep and personal revelations. **Notice**: The talk will be filmed as part of a movie created by the [Villum Foundation](http://veluxfoundations.dk/en), who funded a large chunk of Sensible DTU. And - if you agree - you might end up in the film.

Hartmut Lentz Talk

Thu, 21 Sep 2017 14:59:56 GMT

For the concluding speaker in our September-talks series, I am happy to announce that [Hartmut Lentz](https://www.fli.de/de/institute/institut-fuer-epidemiologie-ife/wissenschaftler/dr-h-h-k-lentz/) will talk about his work on spread of infectious disease on temporal networks. Hartmut works at the "Institut für Epidemiologie" at the Friedrich Loeffler Institute. He is a fantastic speaker and an authority on temporal networks. Full details below. - **Date**: Thursday, September 28th - **Time**: 1pm - **Location**: DTU, Building 321, 1st floor lab-space - **Title**: Spread of infectious diseases in temporal networks **Abstract**: Many networks are treated as static objects, although they are in fact strongly time-dependent. This can have a dramatic impact on the possible spreading patterns for infectious diseases. A static (aggregated) trade network is constructed as follows: if two nodes are connected directly to each other in a time-dependent network, the same connection is present in the static network. A fundamental difference between the static and the time-dependent view, however, is the consideration of paths, i.e. indirect connections over more than one edge. Concerning paths, the causality of the edges used plays an essential role. In an aggregated network, paths can seem causal, although they do not follow a time-respecting sequence of edges in the real system. This leads to a systematic overestimation of outbreak sizes, if time-dependent networks are treated as static. We introduce a new method, which allows for the computation of the total causal path structure of a temporal network (represented by its accessibility graph) using the adjacency matrices of its snapshots. In addition, information about the timescales required for path traversal can be derived from the step-by-step derivation of the accessibility graph of the network. This procedure directly yields the distribution of shortest path durations in a temporal network. 
In addition, we define the new measure causal fidelity that compares the number of paths in a temporal network with its aggregated counterpart. This measure allows a quantitative assessment of how well a temporal network can be approximated by a static aggregated one. The methods presented here require only basic knowledge of linear algebra and can be implemented efficiently. Their capability is demonstrated for three examples: networks of social contacts, livestock trade, and sexual contacts.
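As a small illustration of the ideas in the abstract, the sketch below builds the accessibility graph of a tiny temporal network by stepping through its snapshots in time order, which is equivalent to a Boolean product of (identity + adjacency matrix) over the snapshots. It then compares the number of time-respecting paths with the number of paths in the aggregated network. Treat it as a reading of the abstract under assumptions, not Hartmut's code: the function names, the edge-list representation, the three-node example, and the choice to define causal fidelity as a plain ratio of path counts are all assumptions made here.

```python
def accessibility(n, snapshots):
    """Nodes reachable from each node via time-respecting paths.

    snapshots is a list of directed edge lists, one per time step, in time
    order. At most one edge is traversed per snapshot, which mirrors the
    Boolean matrix product of (identity + A_t) over all snapshots.
    """
    reach = [{i} for i in range(n)]
    for edges in snapshots:
        step = [set() for _ in range(n)]
        for u, v in edges:
            step[u].add(v)
        # Extend every reachable set by one hop along this snapshot's edges.
        reach = [r | {k for j in r for k in step[j]} for r in reach]
    return reach


def count_paths(reach):
    """Number of ordered pairs (source, target) with source != target."""
    return sum(len(r) - 1 for r in reach)


# Toy example: edge 1->2 is active first, edge 0->1 only afterwards.
n = 3
snapshots = [[(1, 2)], [(0, 1)]]

temporal = accessibility(n, snapshots)

# Aggregated network: all edges present at all times (n - 1 steps suffice).
aggregated = sorted({edge for snap in snapshots for edge in snap})
static = accessibility(n, [aggregated] * (n - 1))

# Assumed definition: causal paths divided by paths in the aggregated network.
causal_fidelity = count_paths(temporal) / count_paths(static)
print(temporal, static, causal_fidelity)
```

In the toy example, node 0 can reach node 2 in the aggregated network, but not in the temporal one, because the edge 1->2 is gone by the time 0 reaches 1. That is exactly the kind of non-causal path that makes static networks overestimate outbreak sizes.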

Philipp Lorenz Talk

Thu, 14 Sep 2017 09:51:35 GMT

The September talk series continues full steam ahead. This week, you have a chance to see [Philipp Lorenz](http://www.itp.tu-berlin.de/nonlinear_dynamics_and_control_empirical_networks_and_neurodynamics/hoevel/mitglieder/doktorandinnen_und_doktoranden/lorenz) talk about dynamics of topics in online social media. Philipp is a PhD student at TU Berlin's *Institut für Theoretische Physik* in the *Nonlinear Dynamics and Control: empirical networks and neurodynamics* group. Philipp's work focuses on temporal communities of hashtags, modeling the rise and fall of online topics, threshold models with repost and recovery, and more. Details of the talk below - **Date**: Tuesday, September 19th - **Time**: 2pm - **Location**: DTU, Building 321, 1st floor lab-space - **Title**: Capturing and modeling the dynamics of online topics **Abstract**: Online media have a huge impact on public opinion, economics and politics. Every day, billions of posts are created and comments are written, covering a broad range of topics. Especially the format of hashtags, as a discrete and condensed version of online content, is in our focus. Here we present a pipeline, consisting of methods from static community detection as well as novel approaches for tracing the dynamics of topics in temporal data. We build co-occurrence networks from hashtags with timestamped edges. On static snapshots we infer the community structure and solve the resulting bipartite matching problem, by taking into account higher order memory. The results are robust to temporal fluctuations and instabilities of the static community detection. In the resulting dynamics, across various datasets and for different observables, such as the community sizes or the likes they gather as a proxy for the popularity of a topic, we observe universal behavior. 
Despite their versatility we find that in all datasets the distributions of gains and losses in popularity are fat-tailed, indicating occasional but large and sudden changes in public interest. We hypothesise that only a few mechanisms may govern this behavior: - Gaining interest follows the rule of preferential attachment. - Saturation of the limited attention span decreases its fame. - Discrete ranking leads to a competition between threads. With these ingredients, we are able to design a class of models, which can reproduce the qualitative dynamics and the quantitative distributions of dynamical properties in the empirical observations. The model parameters and the required configuration for a given dataset is informational with respect to the sociological and psychological mechanisms that drive the dynamics of popularity in different contexts.

Bjarke Felbo on Emoji and Emotions

Fri, 08 Sep 2017 15:25:16 GMT

We continue the streak of exciting September talks. This time it's DTU alum (now MIT) Bjarke Felbo who recently caused an international press frenzy (see [MIT Technology Review](https://www.technologyreview.com/s/608387/an-algorithm-trained-on-emoji-knows-when-youre-being-sarcastic-on-twitter/), [BBC](http://www.bbc.com/news/technology-40850171), [Newsweek](http://www.newsweek.com/emoji-computer-sarcasm-emotion-training-hate-speech-647474), [Business Insider](http://www.businessinsider.com/this-algorithm-knows-when-youre-being-sarcastic-on-twitter-2017-8), [The Telegraph](http://www.telegraph.co.uk/technology/2017/08/07/oh-great-robots-can-now-tell-people-sarcastic/), [The Register](https://www.theregister.co.uk/2017/08/07/sarcasm_detector_bot_mit/), [Huffington Post (FR)](http://www.huffingtonpost.fr/2017/08/04/cette-machine-sait-reconnaitre-le-sarcasme-sur-les-reseaux-socia_a_23064997/), [Numerama (FR)](http://www.numerama.com/tech/280942-deepmoji-le-reseau-de-neurones-artificiels-qui-detecte-le-sarcasme-grace-aux-emojis.html) for details) with his sarcasm-savvy deep learning algorithm. Now there's a great opportunity to get all the technical details and ask questions, etc. - **Place**: Technical University of Denmark, Building 210, room 112. - **Date**: September 13th, 2017 - **Time**: 13:00 **Title**: Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm **Abstract**: NLP tasks are often limited by scarcity of manually annotated data. In social media sentiment analysis and related tasks, researchers have therefore used binarized emoticons and specific hashtags as forms of distant supervision. Our paper shows that by extending the distant supervision to a more diverse set of noisy labels, the models can learn richer representations. 
Through emoji prediction on a dataset of 1246 million tweets containing one of 64 common emojis we obtain state-of-the-art performance on 8 benchmark datasets within sentiment, emotion and sarcasm detection using a single pretrained model. Our analyses confirm that the diversity of our emotional labels yield a performance improvement over previous distant supervision approaches.

Piotr Sapieżyński on Gender and Fairness

Wed, 30 Aug 2017 13:05:08 GMT

Friend of the lab & postdoc at Northeastern University Piotr Sapieżyński is visiting Copenhagen and we're lucky to hear about his ongoing work on FAT (Fair, Accountable, and Transparent) Machine Learning. This talk, which focuses on the *fair* part of FAT ML, is not one to miss if you want to be on the cutting edge of ethically responsible Machine Learning. - **Date**: September 7th, 2017 - **Time**: 13:00 - **Place**: Technical University of Denmark, Building 321, first floor lab space. **Title**: Academic performance prediction in a gender-imbalanced environment **Abstract**: Individual characteristics and informal social processes are among the factors that contribute to a student's performance in an academic context. Universities can leverage this knowledge to limit drop-out rates and increase performance through interventions targeting at-risk students. Data-driven recommendation systems have been proposed to identify such students for early interventions. However, we find that the performance of some students is best predicted using indicators that differ from those predictive for the majority. Naive approaches that do not account for this fact might favor the majority class and lead to disparate mistreatment in the case of minorities. In this presentation I will talk about behavioral and psychological differences between male and female participants of the Copenhagen Networks Study, and how these differences can contribute to unequal performance in the academic achievement prediction problem. I will also stress the importance of the error analysis in seemingly well-performing predictors and review the approaches to fair machine learning.

Bernardo Huberman Visit

Wed, 16 Aug 2017 15:38:36 GMT

Wow. We are lucky to have legendary researcher Bernardo Huberman visiting later this month. His production of high-impact papers, books, and patents is too rich and plentiful to reproduce here, so I’ll simply quote Wikipedia’s summary! **Bernardo Huberman** is a Senior Fellow and Senior Vice President at [Hewlett Packard Enterprise Company](http://www.labs.hpe.com/), and Director of the Mechanisms and Design Lab at Hewlett Packard Labs. He is currently a Consulting Professor in the Department of [Applied Physics](https://en.wikipedia.org/wiki/Applied_Physics) and the Symbolic System Program at [Stanford University](https://en.wikipedia.org/wiki/Stanford_University). Bernardo has been a central player throughout the rise of network theory (and mentor for field notables, such as Lada Adamic and Jure Leskovec), but that’s just a fraction of what he’s accomplished. If you care about anything related to information sciences, this is a talk you cannot miss. Here are the details: - **Date**: August 29, 2017. - **Time**: 14:00 - **Location**: Technical University of Denmark, Building 324, Room 040. **Title**: Social media and the attention economy **Abstract**: We are witnessing a momentous transformation in the way people interact and exchange information with each other. Content is now co-produced, shared, classified and rated by millions of people, while attention has become the ephemeral and valuable resource that everyone seeks to acquire. This content explosion is to a large extent driven by a mix of novel technologies and the deep human drive for recognition. This talk will describe the regularities that govern how social attention is allocated among all media and the role it plays in the production and consumption of content. It will also describe how its dynamics determines the emergence of public agendas while making it possible to predict the evolution of social trends.

Roberta Sinatra visit

Sat, 25 Mar 2017 13:02:09 GMT

I'm very excited to have [Roberta Sinatra](http://www.robertasinatra.com) visiting the group for the week of April 3rd. She is an Assistant Professor at the Center for Network Science and Math Department at the Central European University in Budapest. Roberta works on 'the science of success', her most recent adventures **resulting in two very impressive pieces in the interdisciplinary journal *Science*** (and corresponding worldwide press coverage). Check out those papers [here](http://science.sciencemag.org/content/354/6312/aaf5239) and [here](http://science.sciencemag.org/content/355/6324/477). She will give a talk about her work at DTU Compute. Details can be found below. - **Date**: April 4th, 2017 - **Time**: 13:00 - **Location**: Technical University of Denmark, Building 321, 1st floor lab space. **Title**: Quantifying the evolution of individual scientific impact **Abstract**: Despite the frequent use of numerous quantitative indicators to gauge the professional impact of a scientist, little is known about how scientific impact emerges and evolves in time. In this talk we quantify the changes in impact and productivity throughout a career in science and show that impact, as measured by influential publications, is distributed randomly within a scientist’s sequence of publications. This random impact rule allows us to formulate a stochastic model that uncouples the effects of productivity, individual ability and luck, unveiling the existence of universal patterns governing the emergence of scientific success. The model assigns a unique individual parameter Q to each scientist, which is stable during a career and accurately predicts the evolution of a scientist’s impact, from the h-index to cumulative citations. Finally, we show that the Q-parameter is more predictive of independent recognitions, like prizes, than cumulative citations, h-index or productivity.

Michael Szell Visit

Sat, 25 Mar 2017 12:43:27 GMT

We are very lucky to have [Michael Szell](http://michael.szell.net) visiting the week of April 3rd. Michael is a research fellow at the Hungarian Academy of Sciences, Centre for Social Sciences and visiting at Northeastern University, Center for Complex Network Research. He's previously worked at the MIT Media Lab's Senseable City Lab. Michael's research focuses on a quantitative understanding of collective behavior: how the underlying patterns of our interlinked actions and decisions can be modeled in computational social science. His past research involves mining and modeling large-scale data sets of human activity following a complex networks/systems approach. His exciting work has been featured in PNAS, Nature Physics, Science, and many other fine journals. During his visit, Michael will give a talk at DTU Compute. - **Date**: Tuesday April 4th, 2017 - **Time**: 14:00 - **Location**: Technical University of Denmark, Building 321, 1st floor lab space. **Title**: Using network science and data visualization to assess the potential of urban sharing economies **Abstract**: We introduce the notion of shareability network, which allows us to model the collective benefits of sharing rides as a function of passenger inconvenience, and to efficiently compute optimal sharing strategies on massive datasets. We first apply this framework to a dataset of millions of taxi trips taken in New York City, showing that with increasing but still relatively low passenger discomfort, cumulative trip length can be cut by 40% or more. This benefit comes with reductions in service cost, emissions, and with split fares, hinting toward a wide passenger acceptance of such a shared service. Simulation of a realistic online system demonstrates the feasibility of a shareable taxi service in New York City. Shareability as a function of trip density saturates fast, suggesting effectiveness of the taxi sharing system also in cities with much sparser taxi fleets or when willingness to share is low. 
Indeed, applying the same framework to a diverse set of world cities, using data on millions of taxi trips beyond New York City, in San Francisco, Singapore, and Vienna, we compute the shareability curves for each city, and find that a natural rescaling collapses them onto a single, universal curve. We explain this scaling law theoretically with a simple model that predicts the potential for ride sharing in any city, using a few basic urban quantities and no adjustable parameters. Accurate extrapolations of this type will help planners, transportation companies, and society at large to shape a sustainable path for urban growth. Finally, we present "What the Street!?", an online platform for the interactive exploration of city-wide mobility spaces, published in April 2017. The aim of What the Street!? is to facilitate the intuitive exploration of (wasted) mobility space in cities, exploring why and to which extent space is distributed unevenly between different modes of transportation. We demonstrate how this data visualization of re-ordered city spaces can effectively inform relevant stakeholders and the public about large-scale reductions of parking spaces in future scenarios of wide-spread car-sharing.

Kim Albrecht Visit

Tue, 07 Mar 2017 14:21:48 GMT

We're very lucky to have [Kim Albrecht](http://kimalbrecht.com) visit for a few days later this month. Kim is a gifted visual researcher and information designer. His work is absolutely amazing (beautiful as well as informative). ![Screenshot 2017-03-07 14.16.02](/assets/20170307_kim_albrecht_visit/screenshot-2017-03-07-14-16-02.png) Above is, for example, a summary of the careers of 128 tennis players; read the full story [here](http://kimalbrecht.com/project/untangling-tennis/). We are very lucky to have Kim speaking at DTU later this month!! - **Title**: *Imagining Complex Systems* - **Time**: Tuesday March 28th, 10 AM - **Location**: DTU Building 321, 1st floor lab space ([details](http://sunelehmann.com/directions)) **Abstract**: How can visualization help to understand the world surrounding us? That is the basic underlying question that comes up in all projects that Kim has investigated throughout the last years. This theme sees design as something different than communication or decoration. It is not about a style, a trend or fashion anymore. The design process becomes a tool to create insights and knowledge. But once investigating these created technological artifacts in more depth all the cultural formations forming the graphics come into focus demonstrating the subjectivity of visualization. **Bio**: As a visual researcher & information designer, Kim Albrecht is interested in networks, time, power, processes and how we can find visual representations for these topics to produce and represent knowledge. Currently, Kim is based in Boston, working at the Center for Complex Network Research as a visualization researcher. He collaborates and builds visualization interfaces with research groups from a wide variety of scientific fields and Universities (Harvard University, UCLA, Stanford University). In 2016 Kim started his Ph.D. research at the University of Potsdam in the field of media theory, researching information visualizations and their interfaces regarding their epistemological value.

DTU Ørsted COFUND Postdoc opportunity

Tue, 17 Jan 2017 17:33:34 GMT

Have you recently finished your PhD? And would you like to come to Denmark to work with deep learning on an amazing dataset? Then keep reading. There's a great opportunity for DTU funding that we can apply for together. ## *Proposal*: Deep learning, network structure, and language on Twitter Based on a massive dataset (10% of all tweets going back to 2012), we wish to study the interplay between language and network structure. Specifically, we wish to study the interplay between language evolution and network evolution across time (effectively the co-evolution of language and network structure). As part of the grant application, you will help shape the research questions, but a rough idea would be to use deep learning approaches (word embeddings, LSTMs) to represent the language component, and state-of-the-art network science approaches for the network evolution. ## Advisors and further information Academic **advisors** - Associate Professor, Sune Lehmann, DTU Compute. https://sunelehmann.com/ - Professor, Lars Kai Hansen, DTU Compute. http://cogsys.imm.dtu.dk/staff/lkhansen/lkhansen.html/ - Associate Professor, Alan Mislove, Northeastern University. http://www.ccs.neu.edu/home/amislove/ Main **practical** requirements for the COFUND grant - At the time of recruitment (1 July 2017) applicants must not have resided or carried out their main activity in Denmark or at DTU for more than 12 months in the 3 years immediately prior to recruitment (excl. holidays and short visits) - Successful applicants must move to Denmark by the time of employment at the latest; - The applicant must, by the time of recruitment (1 July 2017), be in possession of a doctoral degree or have at least 4 years of full-time equivalent research experience **Detailed info** regarding DTU COFUND: **Next steps** - Contact Sune at [sljo@dtu.dk](mailto:sljo@dtu.dk) for further information. 
Note that the grant application to COFUND must be a collaboration between faculty at DTU and an interested applicant. ## Qualifications Candidates must have a strong publication record within deep learning and/or network science.

Martin Rosvall Talk

Wed, 09 Nov 2016 09:50:16 GMT

Renowned network scientist and creator of [InfoMap](http://www.mapequation.org/code.html) (probably the world's best community detection algorithm for complex networks), [Martin Rosvall](http://www.tp.umu.se/~rosvall/), is visiting Copenhagen. And I've managed to convince him to visit DTU to give a talk! Martin is an associate professor at the department of physics at the university of Umeå (Sweden). He's an accomplished author of [many highly cited papers](https://scholar.google.dk/scholar?q=martin+rosvall&btnG=&hl=en&as_sdt=0%2C5), and a great speaker. Thus, I strongly recommend you come see his talk. The details are below: - **Time**: Wednesday December 7, 11:00am - **Place**: Technical University of Denmark. Building 321, 1st floor Lab Space. - **Title**: *Maps of sparse Markov chains efficiently reveal community structure in network flows with memory* - **Abstract**: To better understand the flows of ideas or information through social and biological systems, researchers develop maps that reveal important patterns in network flows. In practice, network flow models have implied memoryless first-order Markov chains, but recently researchers have introduced higher-order Markov chain models with memory to capture patterns in multi-step pathways. Higher-order models are particularly important for effectively revealing actual, overlapping community structure, but higher-order Markov chain models suffer from the curse of dimensionality: their vast parameter spaces require exponentially increasing data to avoid overfitting and therefore make mapping inefficient already for moderate-sized systems. To overcome this problem, we introduce an efficient cross-validated mapping approach based on network flows modeled by sparse Markov chains. To illustrate our approach, we present a map of citation flows in science with research fields that overlap in multidisciplinary journals. 
Compared with currently used categories in science of science studies, the research fields form better units of analysis because the map more effectively captures how ideas flow through science.

New paper in PNAS!

Wed, 24 Aug 2016 21:55:15 GMT

I'm super excited to announce that we recently had a new paper published in PNAS. And by 'we' I mean my former PhD Student [Vedran Sekara](https://vedransekara.github.io/) (first author), my former PostDoc [Arek Stopczynski](https://stopczynski.com/), along with yours truly. ![screenshot-2016-09-06-19-43-09](/assets/20160824_new_paper_in_pnas/screenshot-2016-09-06-19-43-09.png) Figure 0: [Update Sept 6th, 2016] Extra excited that we made the cover of the printed version! You can download the whole thing by [clicking right here](http://www.pnas.org/content/early/2016/08/22/1602803113.abstract). In addition to the paper itself, PNAS published a nice commentary on the paper [written by Renaud Lambiotte](http://www.pnas.org/content/early/2016/08/22/1612364113.extract) [although that one is behind a paywall]. I'm very proud of the work we've done, and somehow we got away with giving the paper the not-so-humble title *Fundamental Structures of Dynamic Social Networks*. The cool thing is that even though the title is perhaps ostentatious, I actually think that we're on to something fundamental here. I've attempted to write a non-technical explanation below. ## Prologue: The connection to communities Community detection is a big deal in network science. Just look at this plot I created that shows the number of papers about [community detection](https://en.wikipedia.org/wiki/Community_structure) per year. ![Screenshot 2016-08-25 12.59.19](/assets/20160824_new_paper_in_pnas/screenshot-2016-08-25-12-59-19.png) Figure 1: Number of papers on community detection published per year, based on data from Google Scholar [1]. There are literally ***thousands of papers*** that address the topic of finding communities in networks published every single year, so in my world this is an important topic. Detecting communities in complex networks is usually all about finding groups of nodes with many links between them - and only a few links to the rest of the network. 
The typical example network in a community detection paper looks something like this: ![wikipedia.svg](/assets/20160824_new_paper_in_pnas/wikipedia-svg.png) Figure 2: A sketch of the communities world-view that YY, Jim, and I argued against in our 2010 paper in Nature. Illustration from Wikipedia [2]. Back in 2010, [YY Ahn](http://yongyeol.com), [Jim Bagrow](http://bagrow.com) and I [wrote a paper](http://www.nature.com/nature/journal/v466/n7307/abs/nature09182.html) where we argue that there's something fundamentally wrong with this idea of communities. The problem is that the illustration above assumes that each node is a member of only *one single community*. In that paper we argue that this assumption is often wrong. In most networks, each node is a member of more than one community. In social networks, for example, we are in communities of friends, family, co-workers, sports buddies, etc. When each node is a member of many communities, the global picture gets more messy. The network doesn't fall apart into neat chunks as above, rather it looks like a mess of a hairball. [I've written a popular explanation of those findings [here](https://sunelehmann.com/2010/03/23/worlds_colliding/) plus a follow-up [here](https://sunelehmann.com/2010/06/29/pervasive-overlap/).] The hairball below shows a real social network from the PNAS paper. ![Screenshot 2016-08-24 14.29.48](/assets/20160824_new_paper_in_pnas/screenshot-2016-08-24-14-29-48.png) Figure 3: The hairball. Back then, we did not have access to temporal information, but as part of trying to wrap our brains around how this hairball arises, Jim, YY, and I came up with the picture below (Jim actually drew it and impressively figured out how to do the perspective). This illustration - as we shall see below - turned out to be quite prophetic. 
![Screenshot 2016-08-24 14.33.59.png](/assets/20160824_new_paper_in_pnas/screenshot-2016-08-24-14-33-59.png) Figure 4: An early sketch Jim, YY, and I came up with in order to understand how the hairball arises from simple, but overlapping communities. The illustration shows that when single individuals (marked in green and turquoise) participate in multiple communities the underlying simplicity is obscured in the aggregated network. ## Gatherings I had forgotten all about communities when my graduate student Vedran and I started looking at the incredibly detailed data my group had just started collecting as part of the [Copenhagen Networks Study](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095978) (CNS). CNS contains 2.5 years of data collected by handing out 1000 smartphones to nearly all the DTU freshman students, collecting physical proximity data (using Bluetooth to measure the distance between pairs of individuals), phone calls, text messages, Facebook interactions, as well as GPS data. All of this with high temporal resolution (e.g. we recorded face-to-face meetings every 5 minutes). Working as lead hacker-in-residence on top of his data science duties, Arek used a mix of 26-hour days & what I can only describe as pure black magic to start almost from scratch and orchestrate the software infrastructure needed to collect and store all of these data sources in something like six months. With CNS we finally had access to the temporal networks dataset needed to dig deeper. When we looked at the physical proximity data we noticed that, as we considered finer and finer time resolution, the hairball (beautifully) dissolved into meaningful structures. ![Screenshot 2016-08-24 21.25.11](/assets/20160824_new_paper_in_pnas/screenshot-2016-08-24-21-25-11.png) Figure 5: At the right temporal resolution (blue network), we can observe communities directly, no community detection needed. The green network shows interactions aggregated over one day. 
The orange network shows interactions aggregated over 1 hour, and the blue network is a single 5-minute timeslice. The green hairball shows everyone who has spent time together across an entire day. The orange network shows physical contacts aggregated over an hour, and the blue network shows the interactions for a five-minute time slice. The exciting thing is that in the blue network, ***we can directly observe the groups of people hanging out together***. No community detection necessary - we had solved the question that those thousands of papers in Figure 1 are addressing, simply by changing the temporal resolution [3]. Said differently, we've just identified a case where understanding the network got ***easier by adding more data*** (that's why Renaud's commentary is called "Rich Gets Simpler"). Usually it's the opposite. Things usually get a lot more complex when we have to account for more data. Just check any paper on temporal networks (for example take a look at [this excellent review](http://www.sciencedirect.com/science/article/pii/S0370157312000841)). I take the fact that more data has simplified the problem to mean that we're on to something: that we're looking at the network represented at the right temporal resolution. Anyway. We'd just found out how to identify all of the little communities in a timeslice. Now we needed to put the pieces together again. But since we'd figured out the underlying simple principle, we began to study how meetings between people develop over time - simply by matching up groups between neighboring timeslices. The result is ***gatherings*** - the temporal representation of a meeting between a group of individuals that can last anywhere from 15 mins to several hours. We have [a great visualization](https://ulfaslak.com/portfolio/Visualisation/) (with accompanying explainer-video, embedded below) that beautifully describes what gatherings are and how they work. Check out the video, it's only 90 seconds long. 
https://www.youtube.com/watch?v=YpBkH\_0jq2g The visualization was created by Ulf Aslak Jensen, a newly started PhD student in my group. *And it is officially awesome*: earlier this year it won *Science* Magazine's [Data Stories Competition](http://www.sciencemag.org/projects/data-stories/winners)! ## Cores But, while they're already great and exciting, gatherings are only the beginning of the story. If a group of people have a real social connection, they meet again and again. We call gatherings that occur repeatedly ***cores***. It is the cores that are the 'fundamental structures' that organize/simplify the dynamics we observe on the network. Let's dig deeper. First, let's think about what the network looks like from the perspective of a single node. Below, we show an example from a real (and representative) individual. ![Screenshot 2016-08-25 09.21.56](/assets/20160824_new_paper_in_pnas/screenshot-2016-08-25-09-21-56.png) Figure 6: The cores in which a single node participates. Each core is highlighted using a brightly colored background. Note how cores can be overlapping and nested. Instead of modeling each and every interaction in the network, we now have a framework that allows us to think about a node's social activity in a different way. We are able to think about the node as participating in a *sequence of gatherings*, where each gathering is an instance of a core. The node pictured above is a member of 9 cores, each of which has gathered multiple times. If we plot *when in time* each core is active, it looks like this: ![Screenshot 2016-08-25 09.22.13](/assets/20160824_new_paper_in_pnas/screenshot-2016-08-25-09-22-13.png) Figure 7: How the central, red node from Figure 6 participates in gatherings over time. The x-axis shows time, with light-gray vertical bars corresponding to weekends and dark-gray vertical bars showing holidays. Each horizontal row corresponds to a core, and each time that core is active, we highlight the corresponding timeslot. 
Colors match Figure 6. We call this pattern of interactions a person's *social trajectory*, because we can think of the person's journey through the network as jumping from core to core - from social context to social context. It is a massive simplification of the hairball from Figure 3. And it is this simplification - the fact that we are now able to think about dynamic social networks in terms of cores and their activations - that I think is the paper's main contribution. (Plus, having seen how the cores work, I hope it's clear why I said that Figure 4 turned out to be a nice representation of what's actually happening in real networks.) ## Applications In the paper we also spend quite a bit of time showing how this simplification is, in fact, useful for a number of purposes. But since this post is probably already a bit **tl;dr** I'll save a detailed description of those results for another day. But I'll summarize them here. **Firstly**, we show that we can use cores to predict where people will be in the future. The idea is simple. A core is a 'real' object in the network in the sense that when we see a gathering, *all of its members must be present.* This means that observing a part of a core is a signal that we'll soon see the remaining members. In the paper we look at cores of size three and show how a sighting of two core members signals the arrival of the third group member. **Secondly**, we realized that social trajectories have a lot in common with spatial trajectories. Spatial trajectories describe how we move from location to location. From 'home' to 'work' to 'supermarket', etc. The fact that we move through social contexts (cores) just like we move through physical space opens an interesting connection to work on human mobility. 
Specifically, we connect the work on cores to a seminal paper on *[Limits of Predictability in Human Mobility](http://science.sciencemag.org/content/327/5968/1018)*, which showed that for most people, given a sequence of past locations, the next location can be predicted with high accuracy [4]. We find a similar level of predictability given social trajectories, as well as an interesting interplay between the social and geo-spatial predictability (when people are highly unpredictable wrt. their location, they tend to be highly predictable wrt. their social context). ## Rounding up There is much more in the actual paper. For example, we talk about how the cores leave traces in other communication channels. And the paper also contains the technical details (although a lot of them are contained in the massive Supporting Information document). I will write more about the predictability results in a later post (since those findings are actually pretty cool as well). In summary, I hope that I've managed to give you a sense of the paper's central contribution - and perhaps also provided a bit more of an explicit link to the literature (including my own past research) than is readily available from the paper. ## Notes [1] The data was retrieved using the following Google Scholar search query: (“complex network” OR “complex networks” OR “network data”) AND (“community detection” OR “community assignment” OR “network community” OR “network communities” OR “community finding”). The idea for that query comes from Conrad Lee. [2] I grabbed this illustration from Wikipedia. It's by j\_ham3 - Own work. Licensed under Creative Commons BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=17125894. [3] I'm exaggerating a little bit for effect here. The approach we're discussing only works for systems where people are actually meeting face-to-face. Community detection in phone call networks or Facebook is a different story. 
[4] It's a little bit confusing because we're talking about two distinct kinds of predictability. The predictability related to a sequence of location/social contexts has to do with the amount of routine in someone's behavior.
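For the programmers in the audience, here is a rough sketch of the gatherings-and-cores idea from the post: find the groups in each 5-minute timeslice, match groups between neighboring timeslices to get gatherings, and call member sets that gather repeatedly cores. This is only an illustration under my own assumptions, not the procedure from the paper: the Jaccard matching rule, the `min_overlap` and `min_repeats` thresholds, and all function names are hypothetical, and it skips details such as requiring a minimum gathering duration.

```python
from collections import Counter, defaultdict


def groups_in_slice(edges):
    """Connected components of one timeslice, as frozensets of node ids."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, groups = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        groups.append(frozenset(component))
    return groups


def jaccard(a, b):
    """Overlap between two groups: |intersection| / |union|."""
    return len(a & b) / len(a | b)


def find_gatherings(slices, min_overlap=0.5):
    """Chain similar groups across consecutive timeslices.

    slices: list of edge lists, one per consecutive timeslice.
    Returns a list of gatherings, each a list of (slice_index, group).
    The Jaccard threshold is an assumed, simplified matching rule.
    """
    finished, open_gatherings = [], []
    for t, edges in enumerate(slices):
        unmatched = groups_in_slice(edges)
        still_open = []
        for gathering in open_gatherings:
            last_group = gathering[-1][1]
            best = max(unmatched, key=lambda g: jaccard(last_group, g), default=None)
            if best is not None and jaccard(last_group, best) >= min_overlap:
                gathering.append((t, best))
                unmatched.remove(best)
                still_open.append(gathering)
            else:
                finished.append(gathering)  # the group dissolved
        # Groups that continue no existing gathering start new ones.
        still_open.extend([(t, g)] for g in unmatched)
        open_gatherings = still_open
    return finished + open_gatherings


def find_cores(gatherings, min_repeats=2):
    """Member sets (present throughout a gathering) that gather repeatedly."""
    counts = Counter(
        frozenset.intersection(*(group for _, group in gathering))
        for gathering in gatherings
    )
    return {m for m, n in counts.items() if n >= min_repeats and len(m) >= 2}


# Toy data: five timeslices of proximity edges.
slices = [
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("a", "c"), ("d", "e")],
    [],
    [("a", "b"), ("b", "c")],
    [("d", "e")],
]
gatherings = find_gatherings(slices)
cores = find_cores(gatherings)
```

In the toy data, the trio a, b, c meets twice (timeslices 0-1 and again in timeslice 3) and the pair d, e meets twice, so both show up as cores under these assumed thresholds.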

PostDoc Jobs

Sun, 12 Jun 2016 15:52:33 GMT

[Note: Thanks for the many emails on this!! I will post new openings right here on this blog when they arise.] I'm currently involved in two super-exciting projects that **are currently hiring postdocs**, so if you ever thought about moving to Copenhagen to do great science, now is the time. And with all these job postings, you might even be able to bring a (scientifically outstanding) friend. As you probably know Denmark continues to be [the happiest country on the planet](http://www.bbc.com/news/world-europe-35824033), and the food & drink is amazing - with [almost too many Michelin stars](http://www.visitdenmark.com/denmark/gastronomy/denmark-sets-michelin-record-26-stars-and-first-ever-3-star-restaurant) and a true abundance of [hipster beer](https://en.wikipedia.org/wiki/Mikkeller). The scientific environment is also pretty great (if I have to say so myself). We've built an amazing group around the [Copenhagen Networks Study](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095978) (you may remember that we handed out 1000 smartphones to freshmen at [DTU](http://www.dtu.dk/english) and collected network data for 2.5 years), and recently strengthened the efforts with the Copenhagen Center for Social Data Science (SODAS) at University of Copenhagen (where I'm now associate director). So we have a nice critical mass of interesting graduate students & postdocs with whom to spar, hang out, and grab lunch. What I love about these projects is that they're truly Data Science in the Drew Conway *Data Science Venn Diagram* (see below) sense of the word. ![Data_Science_VD](/assets/20160612_postdoc_jobs/data_science_vd.png) In the language of the illustration above, we want you to have hacking skills + math and stats knowledge. What we offer is projects carried out in close collaboration with people that have domain knowledge. 
(I'm beginning to have experience with this kind of project, and it's completely amazing and refreshing to have a partner who can actually help place your data-driven results in context.) The projects are: - ***Twitter Bots**.* This is the data-science component of a larger project, directed by political scientist, Prof Rebecca Adler-Nissen. The successful candidate will be associated with SODAS and work closely with my group as well as political scientists exploring various qualitative aspects. The full project title is *Digital Disinformation: Exploring the Influence of Disinformation on Western Public Debate*. This is extra fun because I actually have some practical experience [building twitter bots](https://sunelehmann.com/2013/12/04/youre-here-because-of-a-robot/). - ***Network Analysis of case law from international courts***. This one is in collaboration with [Henrik Palmer Olsen](http://research.ku.dk/search/?pure=en%2Fpersons%2Fhenrik-palmer-olsen(d6151186-5867-4059-bc35-8e82ae4be33e).html) at the faculty of law, and the successful candidate will be formally associated with both the faculty of law and my research group. Read more **[here](http://employment.ku.dk/faculty/?show=828905)**. We're also collaborating with natural language processing expert [Anders Søgaard](http://cst.dk/anders/) on both of these projects. (Both networks have lots of text metadata associated with each node, so it's kind of fantastic to have an NLP expert on the team). Both positions are connected to a specific project where you're expected to deliver certain results, but we support 'blue sky' research and once you're set up, we welcome your ideas, and participation in ongoing research topics. > **Action item**: If you're interested, **send me an email**! PS. 
We're also looking for a **third postdoc** in a more NLP-minded project for the *[Data Transparency Lab](http://www.datatransparencylab.org),* this one helmed by [Anders](http://cst.dk/anders/)*.* This project is about how well we can identify an author in a large text corpus based on e.g. their tweets. There's a great team on this one listed in Anders' tweet below. https://twitter.com/soegaarducph/status/741936618395832320

Esteban Moro Talk

Tue, 07 Jun 2016 21:45:19 GMT

Monday June 13th is shaping up to be an exciting day for data science in Copenhagen. I've already [announced that Christo Wilson is giving a talk at DTU](http://sunelehmann.com/2016/05/25/christo-wilson-talk/), but now I'm happy to add Esteban Moro to the speaker line-up for a fantastic double bill. (And Piotr's PhD defense at 2pm that afternoon will also be quite an event.) Esteban Moro is a researcher at [Universidad Carlos III de Madrid](http://www.uc3m.es/) in the [GISC group](http://www.gisc.es/) working on complex systems. On his [superb blog](http://estebanmoro.org/) he notes that "The fact that the systems under study are complex does not mean that its behavior cannot be understood or anticipated. I believe research must be interdisciplinary and close to real life problems and because of that, I do research in [social networks](http://estebanmoro.org/tag/social-networks/), [financial markets](http://estebanmoro.org/tag/finance/) or [viral marketing](http://estebanmoro.org/tag/viral/) (complex enough!)". Esteban's work is creative, inspiring, and always exciting (plus [often covered in the press](http://estebanmoro.org/category/media/)). We are lucky to have him. The details of Esteban's talk are - **Time**: Monday June 13th, 10:45am - **Place**: DTU, Building 321, 1st floor lab space **Title**: Pace of change in urban social networks **Abstract**: Urban communities are seen both as highly structured social settings as well as distinctly vibrant environments for interaction, where personal relationships are initiated, consolidated and, eventually, lost and replaced by new relationships. Here we investigate statistical relationships between the social structure of the urban community and the pace at which such structure changes over time. To this end, we analyze the 19-month evolution of the social interactions pertaining to urban communities in England, Wales and Scotland, as described by 700 million mobile phone calls made among 20 million inhabitants. 
We find that different urban communities display not only distinct social structures but also alter such structures at widely different paces. Furthermore, we investigate the impact of this heterogeneity in the network varying structure on information diffusion processes by simulating SI models. Our results indicate that time to infection can be well predicted using only static variables of the network, such as the number of connections, leading to the conclusion that the observed vibrant mechanics in link creation have a negligible impact on the information diffusion in terms of geographical spreading.

Christo Wilson Talk

Wed, 25 May 2016 11:46:59 GMT

A PhD defence is a great way to bring interesting people to Denmark, and Piotr's defense on June 13th is no exception. This time we're lucky to have recent NSF Career grant recipient [Christo Wilson](http://www.ccs.neu.edu/home/cbw/) from Northeastern University visiting. Christo's work includes auditing algorithms, security and privacy, and online social networks. Much of his work focuses on using measured data to analyze and understand complex phenomena on the Web. In many cases, he has leveraged the knowledge gained from measurements of the Web to build systems that improve security, privacy, and transparency for users - and getting [lots of nice press coverage in the process.](http://www.ccs.neu.edu/home/cbw/press.html) - **Time**: Monday June 13th, 10am - **Location**: DTU, Building 321, 1st floor lab space **Title**: Caught Red Handed: Tracing Information Flows Between Ad Exchanges Using Retargeted Ads **Abstract**: Numerous surveys have shown that Web users are seriously concerned about the loss of privacy associated with online tracking. Alarmingly, these surveys also reveal that people are also unaware of the amount of data sharing that occurs between ad exchanges, and thus underestimate the privacy risks associated with online tracking. In reality, the modern ad ecosystem is fueled by a flow of user data between trackers and ad exchanges. Although recent work has shown that ad exchanges routinely perform cookie matching with other exchanges, these studies are based on brittle heuristics that cannot detect all forms of information sharing, especially under adversarial conditions. In this study, we develop a methodology that is able to detect client- and server-side flows of information between arbitrary ad exchanges. Our key insight is to leverage retargeted ads as a mechanism for identifying information flows. 
Intuitively, our methodology works because it relies on the semantics of how exchanges serve ads, rather than focusing on specific cookie matching mechanisms. Using crawled data on 35,448 ad impressions, we show that our methodology can successfully categorize four different kinds of information sharing between ad exchanges, including cases where existing heuristic methods fail.

Data Stories Winner

Wed, 25 May 2016 11:29:50 GMT

Ulf Aslak Jensen, who's writing his M.Sc. thesis in my group (well, actually he's at the Weizmann Institute working with Uri Alon, but that's another story) has just ***won*** *[Science Magazine](http://www.sciencemag.org)*'s [Data Stories competition](http://www.sciencemag.org/projects/data-stories) with the following video about a cool visualization he created based on SensibleDTU data. https://www.youtube.com/watch?v=YpBkH\_0jq2g Ulf has gotten lots of nice coverage, both internationally - Forbes: [Scientific Findings Visualized: The Winners Of The AAAS Data Stories Video Contest](http://www.forbes.com/sites/kevinmurnane/2016/05/19/scientific-findings-visualized-the-winners-of-the-aaas-data-stories-video-contest/#6b869bf61552) - PopSci: [Winning Data Stories Explain how the World Works in 90 Seconds.](http://www.popsci.com/2016s-winning-data-stories-explore-how-our-worlds-works-in-90-seconds) And in the local Danish Press - DTU News [in Danish]: [Kreativ videnskabsformidling vinder Science-konkurrence](http://www.dtu.dk/Nyheder/2016/05/Kreativ-videnskabsformidling-vinder-konkurrence?id=3201df95-6059-4a2e-9976-533e71164c0b) - Videnskab.dk [in Danish]: [DTU-studerende vinder Science-pris for kreativ formidling](http://videnskab.dk/kultur-samfund/dtu-studerende-vinder-science-pris-kreativ-formidling) - Copenhagen Post: [DTU student wins prestigious science award](http://cphpost.dk/news/dtu-student-wins-prestigious-science-award.html) Nice work!!

Dave Choffnes visit

Thu, 12 May 2016 11:52:44 GMT

Next Thursday, we're lucky to have Dave Choffnes visiting the lab. David Choffnes is an assistant professor in the College of Computer and Information Science at Northeastern University. His research is primarily in the areas of distributed systems and networking, with a recent focus on mobile systems and privacy. Much of his work entails crowdsourcing measurement and performance evaluation of Internet systems by deploying software to users at the scale of tens or hundreds of thousands of users. He earned his PhD from Northwestern (not in the northwest), and completed a postdoc at the University of Washington (in the northwest) prior to joining Northeastern (both in the northeast and northwest). He sees no reason why this should at all be confusing. He is a co-author of three textbooks, and his research has been supported by the NSF, Google, the Data Transparency Lab, VidScale, M-Lab, and a Computing Innovations Fellowship. - **Time**: Thursday May 19th, 11am - **Location**: DTU, Building 321, 1st floor lab space **Title**: ReCon: Identifying and Controlling Privacy Leaks from Mobile Devices **Abstract**: Mobile systems have become increasingly popular for providing ubiquitous Internet access; however, recent studies demonstrate that software running on these systems extensively tracks and leaks users' personally identifiable information (PII). I argue that these privacy leaks persist in large part because mobile users have little visibility into PII leaked through the network traffic generated by their devices, and have poor control over how, when and where that traffic is sent and handled by third parties. [![](/assets/20160512_dave_choffnes_visit/img_0008_662ac121.png)](/assets/20160512_dave_choffnes_visit/img_0008.png) In this talk, I describe ReCon, a cross-platform system that reveals PII leaks and gives users control over them without requiring any special privileges or custom OSes. 
Specifically, our key observation is that PII leaks must occur over the network, so we implement our system in the network using a software middlebox. We then use a machine learning approach to efficiently and accurately detect users' PII without knowing a priori the content that is PII. Further, we develop techniques to block, obfuscate, or ignore the PII leak, by displaying leaks via a visualization tool and letting the user decide how the system should act on transmitted PII. I discuss the design and implementation of the system and evaluate its methodology with measurements from controlled experiments and flows from a user study with more than 100 participants. In addition to revealing and controlling PII leaks, we are using our machine-learning-based techniques to automatically identify and block malware based on network behaviors.

United Stats of Words

Sat, 09 Jan 2016 14:06:48 GMT

In early December, [Alan Mislove](http://www.ccs.neu.edu/home/amislove/) (who's spending his sabbatical here in Copenhagen) and I, got the Volvo and headed out to the Amager Campus of University of Copenhagen to pick up [Anders Søgaard](http://cst.dk/anders/), a professor of linguistics, to work on a top secret research project. The project itself is still classified, but one of the things we're looking into is word-usage in geo-coded tweets across the globe (to begin with, just America). To do this, Alan has trawled through something like 65 billion tweets and extracted the ones with geotags (1-2% of all tweets) further grabbing the ones that are from the US (about a third of those), ending up with a set of around 450 million geotagged tweets. We couldn't help ourselves - this dataset was just too cool not to visualize. And because Alan is a wizard, you can try this out for yourself on . Once this thing hit twitter, people found lots of fantastic examples, and I've included some of my personal favorites below https://twitter.com/rlmcelreath/status/685496518049181698 https://twitter.com/CT\_Bergstrom/status/685622838523375616 https://twitter.com/CT\_Bergstrom/status/685393728043786240 https://twitter.com/eigenfactor/status/685378591576571904 [tweet 685379845631574016 hide\_thread='true'] https://twitter.com/TrevorABranch/status/685396455486406656 https://twitter.com/TrevorABranch/status/685397387204902912 https://twitter.com/TrevorABranch/status/685527532314079237 https://twitter.com/pkedrosky/status/685513735973240833 Try it out for yourself - but be warned - it's pretty darn addictive. # States Individual states show up very nicely https://twitter.com/CT\_Bergstrom/status/685384542421553152 Typing in "mississippi" will show both the river and the state. 
https://twitter.com/adamgreatkind/status/685615156311797760 https://twitter.com/suneman/status/685188478586515457 Even countries work nicely https://twitter.com/ALL\_CAPS/status/685338232444174336 And (full disclosure), the title for this post was inspired by this tweet, although I like my own little tweak. https://twitter.com/argonaut/status/685616564217905152
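
A quick back-of-the-envelope check of the dataset numbers quoted at the top of this post (roughly 65 billion tweets, 1-2% geotagged, about a third of those from the US). This is only a sketch: the function name is made up for illustration, and the inputs are the rounded figures from the text, not exact counts.

```python
def geotagged_us_estimate(total_tweets, geotag_fraction, us_fraction):
    """Estimate how many tweets survive the two filtering steps.

    All three inputs are the rough figures from the post, so the
    result is an order-of-magnitude estimate, not an exact count.
    """
    return total_tweets * geotag_fraction * us_fraction


TOTAL_TWEETS = 65e9   # "something like 65 billion tweets"
US_FRACTION = 1 / 3   # "about a third of those"

# The post gives the geotag share as a range (1-2%), so compute both ends.
low = geotagged_us_estimate(TOTAL_TWEETS, 0.01, US_FRACTION)
high = geotagged_us_estimate(TOTAL_TWEETS, 0.02, US_FRACTION)

print(f"{low / 1e6:.0f}M to {high / 1e6:.0f}M geotagged US tweets")
```

With these rounded inputs the range comes out at roughly 217 to 433 million, so a final set of "around 450 million" suggests a geotag share near the top of the quoted 1-2% range (or a US share slightly above one third).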

Not a bad year

Mon, 04 Jan 2016 09:32:02 GMT

As we enter the new year, it's always fun to reflect on the year that's just passed. And it's been a good one. So good that I almost entitled this post "Everything is awesome". Below is a list containing a lot of the stuff I should have written about during the year. ## Graduation day Back in June, [Vedran Sekara](https://vedransekara.wordpress.com/) became the first PhD graduate from my group. His thesis was on *Dynamics of High Resolution Networks -* a fine piece of work. And we were lucky to have Petter Holme and James Bagrow visit to be on the committee; it was great to see them both again. ![2015-06-19 20.29.26](/assets/20160104_not_a_bad_year/2015-06-19-20-29-26.jpg) Upon graduating, Vedran landed a nice job with Sony (Lund offices) as a data scientist. He's still a visiting researcher in the group and we're currently collaborating on a few super interesting projects based on Sony's LifeLog App data. ## Arek @ Google And Vedran is not the only person with a cool new job. Arek Stopczynski, a senior postdoc in my group (and all-round awesome data scientist) has landed a super exciting job with Google in California. https://twitter.com/h0pbeat/status/647153800197881856 Arek's work with Google is (of course) top-secret, but they're lucky to have him! ## Alan Mislove Also this year, good friend, brilliant computer scientist, and associate professor at Northeastern University, Alan Mislove (+ family) is spending his sabbatical here in Denmark, with Alan visiting my group. Having him around is not only a lot of fun, but also enlightening ... and we have a few exciting projects in the 'under construction' phase. And Alan is going to be around for another six months :) ## **Still young** For me, it was a big deal to receive the *Sapere Aude* Young Investigator Grant from the Danish Council for Independent Research. 
The grant title is *Microdynamics of Influence in Social Systems,* and you can read a popular description of it **[here](http://ufm.dk/forskning-og-innovation/rad-og-udvalg/det-frie-forskningsrad/for-ansogere/hvad-kan-du-soge/sapere-aude/forskningsleder-portraetter/dff-forskningsleder-2015/sune-lehman)** (it's in Danish). This grant is not easy to win, and will keep me in business for the next few years. ## Sune & Hal Varian And more in Google (and other) news. In September, I gave a talk at the event "[Big data til gavn for vækst og velfærd - en unik dansk mulighed](http://www.dst.dk/da/OmDS/maal-og-resultater/faglige-arrangementer/afholdte-arrangementer)", which took place at the [Danish National Museum](http://en.natmus.dk/) (a pretty cool venue). I gave the talk with collaborator and all-round great guy [David Dreyer Lassen](http://forskning.ku.dk/find-en-forsker/?pure=da/persons/28460). This event, however, had some pretty cool other speakers, including Hal Varian, who's [Google's chief economist](http://www.economist.com/blogs/schumpeter/2012/06/qa-googles-chief-economist-hal-varian) and arguably one of the most influential people on the planet. https://twitter.com/DSTdk/status/641594523496370176 There were [other fancy speakers](http://www.ft.dk/samling/20142/almdel/ufu/bilag/13/1542280/index.htm), for example the Danish Minister of the Interior ("Social- og indenrigsminister") Karen Ellemann. ## Press coverage This year, my group received lots of nice press coverage. Below is a selection. #### On TV! As a first, fun thing I was interviewed on TV for the first time. It was just a local Copenhagen channel, but it was still scary to be right there in a pro studio being interviewed "live on tape". Oh and the interview (which is in Danish) was about the *Science* paper *[Unique in the Shopping Mall](http://www.sciencemag.org/content/347/6221/536.abstract)* by some of our good friends and collaborators at MIT. 
There were a couple of additional videos about our work. One created by DEIC as part of their new e-Science knowledge portal. Watch it **[here](http://vidensportal.deic.dk/node/65)**. And German TV also sent a crew to report on the SensibleDTU experiment. #### Router Craziness Another big event was my PhD student Piotr Sapiezynski's paper *[Tracking Human Mobility Using Wifi Signals](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0130824).* The paper is about how easy it is to recreate human mobility traces using the routers that our smartphones connect to. It also has a nice [explainer site](https://dev.sensible.dtu.dk/pisa/wifilookup/). I also [wrote about it on this blog](http://sunelehmann.com/2015/05/26/tracking-human-mobility-using-wifi-signals/) and tweeted this: https://twitter.com/suneman/status/603184362264338432 But that was just the beginning. That post was by far my most read in the history of this blog and still skews the month-to-month statistics. ![](/assets/20160104_not_a_bad_year/screenshot-2016-01-04-09-14-28.png) And the paper was covered widely, also in the international press, for example the Atlantic's CityLab: https://twitter.com/suneman/status/619260578759094272 #### Lots of other coverage We also received lots of other nice press coverage. I was in the DTU paper talking about how academics can use Twitter. You can find a link in the nice tweet from The Danish Agency for Science, Technology and Innovation (Forsknings og Innovationsstyrelsen). https://twitter.com/UFM\_FI/status/654221717125103616 We were also covered in [Politiken](http://politiken.dk/forbrugogliv/digitalt/internet/ECE2835836/mobiltelefoner-sladrer-om-studerendes-venskaber-smser-og-facebookvaner/) and in the magazine [Dynamo](http://emagstudio.win.dtu.dk/E-books/DYNAMO/Dynamo_42/#/10/) with a beautiful photo spread, featuring [Andrea Cuttone](http://andreacuttone.com)'s beautiful graphics. 
![](/assets/20160104_not_a_bad_year/screenshot-2016-01-04-09-40-38.png) ![](/assets/20160104_not_a_bad_year/screenshot-2016-01-04-09-40-53.png) Also, [my paper](http://journal.frontiersin.org/article/10.3389/fphy.2015.00073/full) with Jari Saramäki and Talayeh Aledavood got lots of coverage, below are a couple of examples: https://twitter.com/TAledavood/status/655689625583927296 https://twitter.com/JariSaramaki/status/652032119590494208 Finally, Vedran and I wrote about Network Science in a Danish popular physics journal and made the cover with one of Vedran's beautiful visualizations. https://twitter.com/vedransekara/status/645168515603931136 The full details on all of this can be found on the Press page, when I get around to updating that. ## Great exchange visits This was also the year where two of my PhD students were spending 6 months of their program abroad (this is standard for Danish PhD students). Piotr Sapiezynski visited [Jure Leskovec at Stanford](https://cs.stanford.edu/people/jure/) and Andrea Cuttone is still visiting [Marta Gonzalez at MIT](http://humnetlab.mit.edu/). I feel very lucky to be able to send the guys out to these groups that are among the most exciting places on the planet. ## Coursera course And I also created a Coursera version of my *Social Graphs and Interactions* course. Here's a link to the course page: . The video explains it pretty well. ## Digital Halo grant We also got a very nice grant from the [Data Transparency Lab](http://www.datatransparencylab.org) to study browsing behavior. https://twitter.com/h0pbeat/status/609363782889029632 We were in excellent company - the other grantees were from prestigious universities like Princeton University, Carnegie Mellon University, Northwestern University, Columbia University, [and many other fine schools](http://www.datatransparencylab.org/grantees.html). Here's a little 40 sec. video explaining the project. https://www.youtube.com/watch?v=UuRF8iDUmVg

Great trip to Cologne

Thu, 03 Dec 2015 07:06:06 GMT

I recently went to the excellent [GESIS CSS Winter Symposium](http://www.gesis.org/css-wintersymposium/program/) in Cologne. The symposium was brilliantly organized by [Markus Strohmaier](http://markusstrohmaier.info), who has grown it into a major Computational Social Science event (this year 290 participants) within a very short time. So many interesting people to talk to! My talk was about *Fundamental Structures of Complex Social Network* and was based on forthcoming work with [Vedran Sekara](https://twitter.com/vedransekara). I had a nice timeslot and received lots of Twitter love (reproduced below for easy access), which - I have to admit - feels pretty darn great when you've worked hard to create some exciting science! https://twitter.com/levordashka/status/671985257986527232 https://twitter.com/yangzhanglu/status/671988168229593088 https://twitter.com/strippel/status/671987616192049152 https://twitter.com/clauwa/status/671982157330579456 https://twitter.com/TahaYasseri/status/671978963598749696 https://twitter.com/AnnySml/status/671982302587670528 https://twitter.com/Betrium/status/671984280025853953 https://twitter.com/digprof/status/671988089183715329 https://twitter.com/socialbeit/status/672004481257705472 https://twitter.com/6884/status/671987960066240513

Digital Halo Video

Tue, 27 Oct 2015 11:36:08 GMT

Yesterday we put together a video that briefly (40 seconds) describes our ongoing [Data Transparency Lab](http://www.datatransparencylab.org) Project *[Digital Halo](http://www.datatransparencylab.org/grantees.html)*. You can take a look below. https://www.youtube.com/watch?v=UuRF8iDUmVg Big thanks to Mieszko and Kelton for invaluable planning/design/production input & help.

Alan Mislove

Tue, 15 Sep 2015 15:10:17 GMT

This whole year, we're lucky enough to have collaborator & all-round awesome guy Alan Mislove spending his sabbatical connected to my group. Alan is an associate professor at the College of Computer and Information Science at Northeastern University. His research concerns distributed systems and networks, with a focus on using social networks to enhance the security, privacy, and efficiency of newly emerging systems. He is a recipient of an NSF CAREER Award (2011), and his work has been covered by the Wall Street Journal, the New York Times, and the CBS Evening News. In October, Alan will give a talk about recent work that has been widely covered in the media - and that I think will be interesting to many of you - the details are here: - **Date & Time**: October 9th, 2015, 11am - **Venue**: DTU, Building 321, first floor lab space - **Title**: Measuring personalization of online services - **Abstract**: Today, many web services personalize their content, including Netflix (movie recommendations), Amazon (product suggestions), and Yelp (business reviews). In many cases, personalization provides advantages for users: for example, when a user searches for an ambiguous query such as "router," Amazon may be able to suggest the woodworking tool instead of the networking device. However, personalization is rarely transparent (or even labeled), and has the potential to be used to the user’s disadvantage. For example, on e-commerce sites, personalization could be used to manipulate the set of products shown (price steering) or by customizing the prices of products (price discrimination). Unfortunately, today, we lack the tools and techniques necessary to be able to detect when personalization is occurring, as well as what inputs are used to perform personalization. In this talk, I discuss my group's recent work that aims to address this problem. First, we develop a methodology for accurately measuring when web services are personalizing their content. 
While conceptually simple, there are numerous details that our methodology must handle in order to accurately attribute differences in results to personalization (as opposed to other sources of noise). Second, we apply this methodology to two domains: Web search services (e.g., Google, Bing) and e-commerce sites (e.g., BestBuy.com, Expedia). We find evidence of personalization for real users on both Google search and nine of the popular e-commerce sites. Third, using fake accounts, we investigate the effect of user attributes and behaviors on personalization; we find that the choice of browser, logging in, and a user's previously content can significantly affect the results presented.

Andrea Baronchelli Talk

Sun, 02 Aug 2015 19:00:01 GMT

In early September, we have another great visitor, [Andrea Baronchelli](https://sites.google.com/site/andreabaronchelli/), who's a faculty member at *City University London*. Andrea is a super-exciting young scientist with varied interests across complex systems science, and a collaborator on my recent *Sapere Aude* grant on [influence in social systems](http://ufm.dk/forskning-og-innovation/rad-og-udvalg/det-frie-forskningsrad/for-ansogere/hvad-kan-du-soge/sapere-aude/forskningsleder-portraetter/dff-forskningsleder-2015/sune-lehman). He will talk about a set of very cool experiments on the emergence of social conventions. Details below: - **Date**: September 3rd, 2015 - **Time**: 11am - **Location**: Technical University of Denmark, Building 321, 1st floor lab space - **Title**: The Spontaneous Emergence of Social Conventions: An Experimental Study of Cultural Evolution - **Abstract**: How do shared conventions emerge in complex decentralised social systems? This question engages fields as diverse as linguistics, sociology, and cognitive science. Previous empirical attempts to solve this puzzle all presuppose that formal or informal institutions, such as incentives for global agreement, coordinated leadership, or aggregated information about the population, are needed to facilitate a solution. Evolutionary theories of social conventions, by contrast, hypothesise that such institutions are not necessary in order for social conventions to form. However, empirical tests of this hypothesis have been hindered by the difficulties of evaluating the real-time creation of new collective behaviours in large decentralised populations. 
Here, I will present experimental results—replicated at several scales—that demonstrate the spontaneous creation of universally adopted social conventions and show how simple changes in a population’s network structure can direct the dynamics of norm formation, driving human populations with no ambition for large scale coordination to rapidly evolve shared social conventions. I will also show that a simple model describes well the experimental results on different classes of social networks. **References**: Experiment: D. Centola and A. Baronchelli, PNAS 112, 1989 (2015) Model: A. Baronchelli, M. Felici, V. Loreto, E. Caglioti and L. Steels, JSTAT P06014 (2006) (see also )

Jim Bagrow Visit and Talk

Mon, 15 Jun 2015 19:17:03 GMT

This week my good friend & collaborator [James Bagrow](http://bagrow.com/) ([assistant professor at University of Vermont](http://www.uvm.edu/~cems/?Page=employee/profile.php&SM=employee/_employeemenu.html&EmID=1096)) is visiting the group. He's an excellent speaker, and we're lucky enough that he's agreed to give a talk as part of his visit. If you're anywhere near Copenhagen, his talk is worth the trip out to DTU. Here are the details: - **Time**: Friday June 19th, 2015. 10:00am - **Location**: Technical University of Denmark, Building 321. First floor "Lab Space". If you need directions, click **[here](http://sunelehmann.com/directions/)**. - **Title**: Data-driven approaches to studying human dynamics - **Abstract**: Research on human dynamics and computational social science has been revolutionized by new data taken from online social networks. These modern datasets capture activity patterns across very large populations. Using these records, new results have been discovered and existing hypotheses have been tested. But what is the fundamental limit of social information stored in these data? These data also have sampling biases and other issues that make uncertainty quantification crucial. Along these lines, I will discuss current projects related to inferring hidden structure in partially observed networks and using large-scale Twitter data to estimate how information is stored and flows through social networks. (And Vedran Sekara's PhD defense is that same afternoon).

Tracking Human Mobility using WiFi signals

Tue, 26 May 2015 09:27:07 GMT

When I started working on understanding social systems, privacy really wasn't on my mind. (I generally want to write down equations, understand the universe and all that). But one of the central realizations arising from our [SensibleDTU experiment](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095978) is that privacy needs to be an important part of this kind of research. I've written about this at length [elsewhere](http://sunelehmann.com/2014/12/11/some-thoughts-on-privacy-part-0/). One of the things we noticed while digging into terabytes of social data is that data-channels are highly correlated. Information "bleeds through" ... something which has serious implications for privacy. Case in point: My group has just released a new preprint ([get it here](http://arxiv.org/abs/1505.06311 "Link to paper")) that *shows how the WiFi information routinely collected by your smartphone can easily be converted to precise information about your location*. WiFi routers reveal where you live, work, and spend your leisure time. While your phone may have told you that WiFi helps “improve location accuracy”, it may come as a surprise that - A majority of apps in the store have access to the list of routers around you (scanned every 20 seconds). - Your Android smartphone by default scans for WiFi routers even if you disable WiFi. Our research shows - How to easily convert WiFi information into geographical position. - That although it sounds like all WiFi scans might be a lot of data to process, your mobility can be described using just a few access points. **And we have built an Android app** **which only requires WiFi data** to illustrate how this works for your own mobility: [Download here](https://play.google.com/store/apps/details?id=dk.dtu.compute.mywifiscanner "Check out the App!"). - That if someone knows these routers at some point in time, they will still know a lot about your mobility six months later. 
Thus, while WiFi networks are intended for enabling connectivity, they are also a *de facto* location tracking infrastructure. More generally, our world is becoming more enclosed in a web of infrastructures supporting communication, mobility, payments, and advertising. Logs from mobile phone networks (call detail records, CDRs) constitute a global database of human mobility and communication networks. Credit card records form high-resolution traces of our spending behaviors. ![The figure shows 48 hours of location data of one of the authors, with the four visited locations marked in blue: home, two offices, and a food market.](/assets/20150526_tracking_human_mobility_using_wifi_signals/subsampling_curves.png) The figure shows 48 hours of location data of one of the authors, with the four visited locations marked in blue: home, two offices, and a food market. Even though the author's phone has sensed 3,822 unique routers in this period, only a few are enough to describe the location more than 90% of the time. (a) traces recorded with GPS; (b) traces reconstructed using all available data on WiFi router locations - the transition traces are distorted, but all stop locations are visible and the location is known 97% of the time. (c) with 8 top routers it is still possible to discover stop locations in which the author spent 95% of the time. In this scenario transitions are lost. (d) timeseries showing when during 48 hours each of the top routers were seen. It can be assumed that AP 1 is home, as it's seen every night, while AP 2 and AP 3 are offices, as they are seen during working hours. The last row shows the combined 95% of time coverage provided by the top 8 routers. It is already a well-known fact that so-called “WiFi scanners” can be [used to track individuals](http://www.sciencedirect.com/science/article/pii/S1574119214001953 "An example paper"). 
This is done by cities, airports, shopping centers, and advertisers (and perhaps intelligence agencies). Some OS manufacturers (e.g. Apple and Chainfire) have recently responded to such tracking by frequently randomizing the unique identifier of each phone. Randomizing the phone identifier, however, *does not address the threat presented in our work*—where data is collected by an application on the phone, not by external devices. The privacy of WiFi scan results is often overlooked. In the Android ecosystem the WiFi scans are not considered as a location signal. WiFi information can be collected by applications without location permission, does not show up in the overview of applications using location data, and the WiFi permission is not considered sensitive. This makes it possible for 3rd party developers to collect high-resolution mobility data under the radar, circumventing the policy and the privacy model of the Android ecosystem. Any app with just the WiFi permission can track your position, although not every such app necessarily does so (there are legitimate reasons for applications to ask for WiFi permission, although this permission seems to be requested more often than required). Last time we checked (February 2015), 17 out of 20 top games on Android Play Store required access to your WiFi data; in only 6 of those 17 cases their privacy policy provided reasons why this information is required. For more information, email the paper's first author Piotr ([pisa@dtu.dk](mailto:pisa@dtu.dk)), who collaborated on this post. Or me ([sljo@dtu.dk](mailto:sljo@dtu.dk)). The [preprint is available on arXiv](http://arxiv.org/abs/1505.06311 "Link to Paper"). 
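
To make the "just a few access points" point concrete, here is a minimal sketch of how one might pick a small set of routers that explains most of a person's time. It is an illustration only, not the procedure from the paper: the greedy selection, the function name `top_routers_coverage`, the time-bin representation, and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def top_routers_coverage(scans, k):
    """Greedily pick up to k access points (APs) covering the most time bins.

    scans: dict mapping a time-bin index to the set of AP identifiers
           seen during that bin (an empty set means no AP was seen).
    Returns (chosen_aps, fraction_of_time_bins_covered).
    """
    # Invert the scans: for each AP, the set of time bins where it was seen.
    bins_by_ap = defaultdict(set)
    for time_bin, aps in scans.items():
        for ap in aps:
            bins_by_ap[ap].add(time_bin)

    uncovered = set(scans)
    chosen = []
    for _ in range(k):
        # Choose the AP that explains the most still-unexplained time bins.
        # Iterating in sorted order makes ties resolve deterministically.
        best = max(
            sorted(bins_by_ap),
            key=lambda ap: len(bins_by_ap[ap] & uncovered),
            default=None,
        )
        if best is None or not (bins_by_ap[best] & uncovered):
            break  # no remaining AP adds any coverage
        chosen.append(best)
        uncovered -= bins_by_ap[best]

    covered = 1 - len(uncovered) / len(scans) if scans else 0.0
    return chosen, covered


# Hypothetical toy data: six time bins, made-up AP names.
toy_scans = {
    0: {"home", "cafe_x"},
    1: {"home"},
    2: {"office"},
    3: {"office", "printer"},
    4: {"market"},
    5: set(),
}
chosen, covered = top_routers_coverage(toy_scans, k=2)
print(chosen, f"{covered:.0%}")
```

In this toy example two routers already account for four of the six time bins. On real scan data the same idea would be applied to far more bins and routers, which is where a figure like "8 routers cover 95% of the time" comes from.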
## Update June 3rd, 2015 (maybe-our-paper-played-a-role-in-this edition) Yesterday, while scouring Google I/O for details on the updated permissions (and to see if anyone mentioned our work), we found that a Google engineer (Ben Poiesz) was asked about the issue of WiFi tracking during the session discussing the new permission model. The session took place on May 29th - the clip is here: https://youtu.be/f17qe9vZ8RM?t=17m30s In the video, the friendly Google engineer notes that - under the new system - apps without the location permission will no longer be able to see the mac addresses of WiFi and Bluetooth devices around ... *because that’s equivalent to location*. No one is claiming (least of all us) that our work caused the change, but we would like to point out a couple of things about the way Google chose to announce it, which might indicate that the choice of fixing wifi is a recent decision on Google's part: - The published source code [find it [**here**](https://android.googlesource.com/platform/packages/apps/Settings/+/android-m-preview/src/com/android/settings/applications/AppOpsState.java "Source Code")] (lines 99-114) and documentation [find it [**here**](https://developer.android.com/preview/features/runtime-permissions.html "Documentation")] *do not yet* indicate that WiFi information is to be treated as location. - When you install the current Android M beta on your phone, our *"WiFi Watchdog" app still works* ... and WiFi is not treated as location. And a technical point: This is not just because of the “legacy mode” - according to the same presentation (): "WiFi Watchdog" should just receive empty data on Android M, but instead it continues to receive the same data as on Lollipop - The announcement of this *arguably major change* (80% of apps on the market would potentially be affected) was not a part of the main presentation ... but an answer during the Q&A session. 
Now, it is *probably just a coincidence*, and maybe a fix for the WiFi permissions has been in the works for months. But it’s quite striking that Google decided to fix wifi permissions 7 years after the existing scheme was introduced (and just days after we published our paper).

What it means to be a pro

Sun, 24 May 2015 09:02:09 GMT

I just love this quote which uses a Tiger Woods anecdote to illustrate what it means to be a professional. It's from [*The War of Art*](http://www.amazon.com/The-War-Art-Through-Creative/dp/1936891026 "The War of Art link") by Steven Pressfield (a great read, btw). > With four holes to go on the final day of the 2001 Masters (which Tiger went on to win, completing the all-four-majors-at-one-time Slam), some chucklehead in the gallery snapped a camera shutter at the top of Tiger’s backswing. Incredibly, Tiger was able to pull up in mid-swing and back off the shot. But that wasn’t the amazing part. After looking daggers at the malefactor, Tiger recomposed himself, stepped back to the ball, and striped it 310 down the middle. > That’s a professional. It is tough-mindedness at a level most of us can’t comprehend, let alone emulate. But let’s look more closely at what Tiger did, or rather what he didn’t do. > First, he didn’t react reflexively. He didn’t allow an act that by all rights should have provoked an automatic response of rage to actually produce that rage. He controlled his reaction. He governed his emotion. > Second, he didn’t take it personally. He could have perceived this shutterbug’s act as a deliberate blow aimed at him individually, with the intention of throwing him off his shot. He could have reacted with outrage or indignation or cast himself as a victim. He didn’t. > Third, he didn’t take it as a sign of heaven’s malevolence. He could have experienced this bolt as the malice of the golfing gods, like a bad hop in baseball or a linesman’s miscall in tennis. He could have groaned or sulked or surrendered mentally to this injustice, this interference, and used it as an excuse to fail. He didn’t. > What he did do was maintain his sovereignty over the moment. He understood that, no matter what blow had befallen him from an outside agency, he himself still had his job to do, the shot he needed to hit right here, right now. 
And he knew that it remained within his power to produce that shot. Nothing stood in his way except whatever emotional upset he himself chose to hold on to. That's something to aspire to.

Visitors this month

Sun, 05 Apr 2015 14:31:29 GMT

This month we have two excellent long-term visitors in the group. Visiting all month is [Ivan Brugere](http://compbio.cs.uic.edu/~ivan/), a graduate from [Tanya Berger-Wolff](http://compbio.cs.uic.edu/~tanya/)'s group at University of Illinois, Chicago. Ivan is interested in spatiotemporal network mining, network inference and prediction, and social network privacy modeling. Stopping by between April 12th and April 18th is [Laura Alessandretti](http://lauraalessandretti.weebly.com/), who's a graduate student with [Andrea Baronchelli](https://sites.google.com/site/andreabaronchelli/) at City University London. Laura, Andrea and I are studying the long-term changes in individual and collective mobility patterns. In the literature, human mobility is typically described on a meta-stable time-scale, where mobility is characterized by regular patterns. We are interested in how this meta-stable regime evolves over long stretches of time (years). [![overal_network](/assets/20150405_visitors_this_month/overal_network-e1428237564779_665e4519.png)](/assets/20150405_visitors_this_month/overal_network-e1428237564779.png) Ivan & Laura will both be giving talks during their visits, so stay tuned for more info.

New Design

Sat, 04 Apr 2015 18:04:41 GMT

The header graphic (it's apparently called that even though it's displayed on the left) accompanying my updated site was created by [Andrea Cuttone](https://www.linkedin.com/in/andreacuttone), a PhD student in my group, using his cool Python geo visualization toolkit [geoplotlib](https://github.com/andrea-cuttone/geoplotlib). It's a [Voronoi](http://en.wikipedia.org/wiki/Voronoi_diagram) construction based on data from the [Copenhagen Networks Study](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095978). The full image is displayed above.

Petter Holme

Wed, 04 Feb 2015 07:36:23 GMT

Emphasizing our focus on temporal networks, I am happy to announce that temporal networks czar, Petter Holme, will visit the lab on Feb 18th. Petter is the author (with [Jari Saramäki](http://sunelehmann.com/2015/01/15/saramaki/), who visited last week) of the recent & excellent [review on temporal networks](http://www.sciencedirect.com/science/article/pii/S0370157312000841). He will be giving a talk, and if you're in the neighborhood, I highly recommend attending. - **Speaker:** Petter Holme. Associate Professor. Department of Energy Science. Sungkyunkwan University, Suwon, Korea - **Date**. February 18th, 2015 - **Time**. 14:00 - **Location**. DTU, Building 321, room 134 - **Title**: Temporal networks of human interaction **Abstract**: Since the turn of the millennium, networks have become a universal paradigm for simplifying large-scale complex systems, and for studying their system-wide functionalities. At the same time, there is considerable evidence that temporal structures, such as the burst-like behavior of human activity, affect dynamic systems on the network. These two lines of research come together in the study of temporal networks. Over the last five years, there has been a growing interest in how to analyze and model datasets in which we not only know which units interact (like in a traditional, static network), but also when the interactions take place. Just like static network analysis, the development of temporal network theory has been accelerated by the availability of new datasets. It should be noted that temporal networks are more than just extensions of static networks—they are e.g. (unlike simple, directed, weighted and multiplex networks) not transitive. In other words, if A and B are connected, and B and C are also connected, this does not imply that A and C are connected. 
Perhaps for this reason, temporal network theory has focused less on structural measures and studies of simple evolutionary models, and more on randomization studies and the simulation of spreading on empirical data. I will describe the state of the field, my own contributions (mostly about how temporal contact patterns affect infectious disease spreading), and discuss some future challenges.
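The non-transitivity point in the abstract is easy to see with a toy example. Below is a minimal sketch (not taken from the talk or the review): the function name `earliest_arrival` and the contact list are made up for illustration, and I'm assuming that something can only be passed along a contact that happens strictly later than the moment it arrived.

```python
def earliest_arrival(contacts, source):
    """Earliest time each node can be reached from `source`.

    `contacts` is a list of (node_a, node_b, time) tuples describing
    undirected contacts. A path is only valid if its contact times are
    strictly increasing (a "time-respecting" path). This is an
    illustrative assumption, not a definition taken from the abstract.
    """
    # The source is "reached" before any contact takes place.
    arrival = {source: float("-inf")}
    # Process contacts in chronological order.
    for a, b, t in sorted(contacts, key=lambda c: c[2]):
        for sender, receiver in ((a, b), (b, a)):
            reached_before = sender in arrival and arrival[sender] < t
            if reached_before and t < arrival.get(receiver, float("inf")):
                arrival[receiver] = t
    return arrival


# Hypothetical toy data: B meets C at time 1, and only later A meets B at time 2.
contacts = [("A", "B", 2), ("B", "C", 1)]

print(earliest_arrival(contacts, "A"))  # C is missing: the B-C contact is already over
print(earliest_arrival(contacts, "C"))  # C reaches B at time 1 and A at time 2
```

In the static picture A-B and B-C are both links, so A and C look connected. Once the timing is taken into account, something starting at C can reach A, but something starting at A can never reach C.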

Privacy Part II: Some examples of why privacy is important.

Sun, 01 Feb 2015 15:35:48 GMT

*[This is part II of a series, you can find an overview [here](http://sunelehmann.com/2014/12/11/some-thoughts-on-privacy-part-0/)]* There are many reasons why privacy is important. I will not try to cover them all here, but instead I have chosen two central topics, which I find particularly important. # "I have nothing to hide, so why should I care?" This one is a [classic retort](http://en.wikipedia.org/wiki/Nothing_to_hide_argument) against privacy advocates. It has been used by Google's then-CEO Eric Schmidt, who famously noted "If you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place" during a [TV interview](http://www.pcworld.com/article/184446/googles_schmidt_roasted_for_privacy_comments.html). And on the surface, it looks like a pretty good one (one that I might have used a few years ago). **Keeping private does not imply wrongdoing** There are many reasons why the nothing-to-hide stance is problematic. To me, the central reason is that it pre-supposes that things we want to keep private are "secrets"; the argument insinuates that a kind of wrongdoing is taking place whenever we want to keep something to ourselves. But imagine that you have just found out you and your significant other are pregnant. Maybe that's a piece of information that you would like to wait [approximately 12](http://www.babycenter.com/0_when-to-announce-your-pregnancy_10349769.bc) weeks before you share with the world? And maybe you would want to tell your close family before announcing the news to a broader circle of friends? On a less cheery note, you might want to control how your surroundings learn about other big personal events, for example a serious disease, such as cancer. As another example consider someone who just fell in love. Not being able to control who knows about your deepest feelings could potentially be deeply embarrassing. These are not exactly dark secrets - just personal issues. 
But they are things that most people can understand why we'd like to keep private. **Personal freedom is restricted** More generally, a world where all of your actions are known to everyone becomes a world where personal freedom is restricted. I feel like I'm already experiencing this on e.g. Facebook, where it looks like many present a curated version of reality to the world, focusing mostly on positive aspects of their life (think photos of cute kids + delicious meals), while ignoring moments of doubt and insecurity. On Twitter, I know that what I say is persistent, so I usually avoid saying anything negative. In writing this post, I searched for "US torture war on terror" on Google and wondered if that would put me on some kind of watch list. Because there is a multitude of things that are completely legitimate, but that we might not want to share with everyone, we risk inhibiting ourselves whenever one of those topics comes up in a "persistent medium". That also means that your freedom is particularly reduced if your personal preferences do not line up completely with main-stream social norms. For example, in a world where every action is known to everyone, young gay or transgender people might have a difficult time finding themselves (= even more difficult than now). **Nothing-to-hide and the government** "Trust is good, but control is better", as Lenin [probably said](http://www.zeit.de/stimmts/2000/200012_stimmts_lenin). If a government systematically collects data on its citizens, the nothing-to-hide discussion finds new nuances. Private information can be used as a means of control (e.g. via blackmail). Now, if your opponent has lots of resources as well as access to a powerful legal system, this type of control is not limited to individuals with "something to hide". There are some great quotes on this. 
[Bruce Schneier](https://www.schneier.com/essays/archives/2006/05/the_eternal_value_of.html) points to Cardinal Richelieu who said "If one would give me six lines written by the hand of the most honest man, I would find something in them to have him hanged." The Russian dissident [Aleksandr Solzhenitsyn](http://chronicle.com/article/Why-Privacy-Matters-Even-if/127461/) said "Everyone is guilty of something or has something to conceal. All one has to do is look hard enough to find what it is". I tend to trust my government, so I'm not too worried about being blackmailed. As a pretty main-stream person, that's probably a good assessment of my situation. But what if you're in a minority? The United States has [tortured innocent people](https://firstlook.org/theintercept/2014/11/11/innocent-man-tortured-u-s-asks-wheres-accountability/) as part of the war on terror. Homosexual acts were illegal in the UK [until 1967](http://en.wikipedia.org/wiki/Labouchere_Amendment), and it can be argued that racial segregation in the US [persists in varying degrees to the present day](http://en.wikipedia.org/wiki/Racial_segregation_in_the_United_States). Additionally, much of the world is run by governments that are not democratic and whose choices and inner workings are not transparent to their citizens. Finally, even if you truly feel like sharing everything, there is a strong argument that *as a society* we want some people to have secrets. We want a free press with journalists that can protect their sources. Protected sources give citizens access to parts of society that we can otherwise never access - the criminal world - or [inside governments](http://en.wikipedia.org/wiki/Edward_Snowden). # The future: loss of self There is another argument for privacy. It's a little more out there, but still central to the debate. The essence of the argument is that data collected about individuals can be used for other kinds of control than simply using blackmail. 
The next level is individualized manipulation. In order to get that part up and running, recall that there have been some very interesting developments in the behavioral sciences, such as cognitive psychology, social psychology, behavioral economics, etc. The term that embodies all of these developments is [nudging](http://en.wikipedia.org/wiki/Nudge_theory). The general idea is that, during human evolution, our brain has developed to make very quick decisions in a world that looks very different from our modern surroundings. Because of the need for milli-second speed, many of our decisions are not based on rational chains of thought, but on built-in heuristics. If you have a few hours to kill, check out this list of known [cognitive biases](http://en.wikipedia.org/wiki/List_of_cognitive_biases). Nudging is essentially the practice of "hacking" these heuristics to manipulate human behavior - and can be used as a force for good (e.g. to promote recycling or saving for retirement) or in more questionable ways (e.g. to sell us stuff). One can imagine that data mining of personal data can be used to create personalized nudges. This is already happening to some extent - for example [people with macs are steered towards more expensive hotel rooms than windows users](http://www.wsj.com/articles/SB10001424052702304458604577488822667325882) on some sites. Clearly, humans have always been manipulating each other - just think back to the last time you purchased a car. But algorithmic nudging is different. In part because it runs at scale, with a single company potentially reaching hundreds of millions of users, and in part because the nudging potentially can be much more precise and effective. We're not there yet, but the long-term perspectives are terrifying. In a fascinating piece in the New York Times called "[Privacy and the Threat to the Self](http://opinionator.blogs.nytimes.com/2013/06/22/privacy-and-the-threat-to-the-self)", the philosopher Michael P. 
Lynch makes the case that complete loss of privacy effectively dehumanizes us and takes away our "self". He writes: > To get a sense of what I mean, imagine that I could telepathically read all your conscious and unconscious thoughts and feelings — I could know about them in as much detail as you know about them yourself — and further, that you could not, in any way, control my access. You don’t, in other words, share your thoughts with me; I take them. The power I would have over you would of course be immense. Not only could you not hide from me, I would know instantly a great amount about how the outside world affects you, what scares you, what makes you act in the ways you do. And that means I could not only know what you think, I could to a large extent control what you do. Here, Lynch - from another vantage point - discusses what we have covered above: the fact that knowing about people allows you to control them. But we begin to see that it's not just about blackmail, but also about *manipulation*. That's where the personalized nudging comes in. Knowing enough allows you to accurately "read people's minds" or at least anticipate their actions. He continues: > That is the *political* worry about the loss of privacy: it threatens a loss of freedom. And the worry, of course, is not merely theoretical. Targeted ad programs, like Google’s, which track your Internet searches for the purpose of sending you ads that reflect your interests can create deeply complex psychological profiles — especially when one conducts searches for emotional or personal advice information: Am I gay? What is terrorism? What is atheism? If the government or some entity should request the identity of the person making these searches for national security purposes, we’d be on the way to having a real-world version of our thought experiment. In the second paragraph, Lynch discusses another point that I have touched upon above. Surveillance is already happening. 
And the quote contains nice examples of how our online behavior may reveal lots of information about us, which is private in the sense that we might not want to share it with everyone, but not something which implies any kind of wrongdoing. The final paragraph goes into why all this implies a loss of self: > But the loss of privacy doesn’t just threaten political freedom. Return for a moment to our thought experiment where I telepathically know all your thoughts whether you like it or not From my perspective, the perspective of the knower — your existence as a distinct person would begin to shrink. Our relationship would be so lopsided that there might cease to be, at least to me, anything *subjective* about you. As I learn what reactions you will have to stimuli, why you do what you do, you will become like any other object to be manipulated. You would be, as we say, dehumanized. If someone knows everything about you, and can manipulate you to their whim, you cease to be a human being. If that's not scary, I don't know what is. Now, we're still far from this scenario (last time I bought a pair of sunglasses online, I encountered pointless ads for sunglasses on most sites for months after). In fact, it's not clear to me that we will ever get to the point where we can accurately make [predictions on the actions of single human beings](http://everythingisobvious.com/). [BTW another great article on the topic, from a more practical standpoint, can be found [here](https://gigaom.com/2014/02/22/data-privacy-machine-learning-and-the-destruction-of-mysterious-humanity/)]. But I hope that I've made a case that privacy is something that is so important that we should all be discussing it.

Saramäki

Thu, 15 Jan 2015 15:45:18 GMT

Later this month, we're lucky to have Jari Saramäki visiting and speaking at the lab. Jari is an expert on temporal networks (I highly recommend the [excellent review paper on temporal networks that Jari co-authored with past and future guest of the lab, Petter Holme](http://www.sciencedirect.com/science/article/pii/S0370157312000841)). Jari is an associate professor at Aalto University and a highly cited author of many high impact papers on complex networks, for example: - Jari Saramäki, E. A. Leicht, Eduardo López, Sam G. B. Roberts, Felix Reed-Tsochas, and Robin I. M. Dunbar. *Persistence of social signatures in human communication.* PNAS **111** (3) 942-947 (2014). - Lauri Kovanen, Kimmo Kaski, János Kertész, and Jari Saramäki. *Temporal motifs reveal homophily, gender-specific patterns, and group talk in call sequences.* PNAS **110** (45) 18070-18075 (2013). - J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási. *Structure and tie strengths in mobile communication networks.* PNAS **104** (18) 7332-7336 (2007). Below are the details of his talk. # From minutes to months: call network dynamics at multiple timescales - **Date**: 27th of January - **Time**: 14:00 - **Location**: Auditorium 040 in DTU Building 324. Abstract: Big Data on human interactions and communication have revolutionized the ways how human behaviour can be approached from a quantitative point of view. Mobile telephone Call Detail Records (CDR's) have proven especially fruitful for understanding one-to-one communication patterns and the dynamics of inferred social networks. I will discuss what happens and when in call networks constructed from CDR's with time stamps; this talk can be considered a mini-review of what we know about temporal networks of mobile telephone calls. 
I will begin with short timescales and fast dynamics (such as burstiness of sequences of calls between individuals) and "zoom out", from temporal motifs formed of correlated calls between multiple individuals to long-term dynamics of personal networks of individuals.

Privacy Part I: Why everyone is complaining, but no one is taking action.

Thu, 15 Jan 2015 09:48:56 GMT

*[This is part I of a series, you can find the overview [here](http://sunelehmann.com/2014/12/11/some-thoughts-on-privacy-part-0/)]* We all have a sense that privacy is important. A sense that our ability to freely express "who we are" is slowly eroded by large corporations and governments collecting data on our actions for purposes not clear to us (and maybe not to them either). But on the other hand, no one is doing anything about this. Why is that? I think that there are two central reasons for this. The first reason is that humans are not very good at handling situations where cause and effect are separated by a lot of time and space. (I was made aware of this point by [an excellent column in the Guardian](http://www.theguardian.com/technology/2012/mar/02/censorship-inseperable-from-surveillance) by the author Cory Doctorow, who I will be stealing from in the following). There are lots of examples of this: No one would smoke if you developed cancer immediately upon the first drag of a cigarette. The possibility of cancer is so far away in time that it feels like the consequences happen to another person. You would be less likely to binge-eat if the food immediately was converted into belly flab. Something similar is going on with privacy. You don't immediately notice any problems when you hand over all of your email correspondence to Google, outsource your social network to Facebook, or sign up for free airport WiFi. And it's even worse because we don't even know what the consequences of sharing those data might be. Figuring out what we can learn about individuals is an emerging field. And while we know that you can estimate e.g. [your political views based on your Facebook feed](http://www.pnas.org/content/110/15/5802.full.pdf+html), we're still working out what it really is that you're revealing about yourself, when you're sharing data ... and how those pieces of information can be used to manipulate you. 
In fact, the entities that know most about what your personal data can be used for (e.g. Google, Facebook, Apple, Amazon) have very little incentive to share this information with the general public. In part because opening up can damage their public image (e.g. the [Facebook mood experiment](http://www.theatlantic.com/technology/archive/2014/06/everything-we-know-about-facebooks-secret-mood-manipulation-experiment/373648/)), and in part because sharing insights might run counter to shareholder interests (e.g. making tons of $$$ manipulating people) [1]. This is why the kind of research that we do here at DTU is so important - providing a public and open counterpoint to large corporations with private research divisions. The second (and even more important) reason is that it is not at all clear what kind of action we should take. Privacy is such a complex issue that even if you want to take action, there is no obvious path to follow. To make such a path a little bit clearer is my goal with the posts in this series. I'll try to find a little bit of solid ground so that maybe we'll have something to mobilize around once we feel like it's time to take action. ## Notes [1] Thanks to [Piotr Sapiezynski](https://twitter.com/sapiezynski) for making this point.

Some thoughts on privacy. Part 0

Thu, 11 Dec 2014 09:25:15 GMT

Recently I've been thinking a lot about privacy. My own work focuses on [what we can learn from dense data collected by volunteers at my university (DTU)](http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0095978), and that means that privacy is something I think about a lot. What we're learning from our amazing dataset shows that data channels are highly overlapping and even something like which WiFi access points your phone sees (sounds innocuous, right?) reveals pretty much everything about both your movement around in space as well as your social connections. Because of our findings, I am concerned, for example, by the City of Copenhagen's [decision to monitor everyone in the city using WiFi routers](http://copenhagenconnecting.com/_include/img/work/full/TM82B2_-_praesentation_CC.pdf). It's not obvious to me that it is OK for the city to perform this kind of monitoring. More generally, the further we move into a future where we haven't set down simple ground rules for what's OK and what is not OK, the more difficult it will be to find our way again. Overall, however, I think sharing data is a great idea and usually data-sharing is a win-win proposition. But we have to make sure that we have rules that ensure the right balance of power between individuals and the entities that use their data. Inspired by [Clive Thompson's thoughts on public thinking](http://educatorinnovator.org/the-power-of-public-thinking-an-interview-with-author-clive-thompson/), I've decided to write a few posts about privacy and data on this blog even though I haven't really figured out what to think about all aspects of the topic yet. Thus, in the coming weeks (probably months, knowing my tendency to procrastinate), I'll be writing about privacy here. 
A tentative outline of the series is: - [Part I: Why everyone is complaining, but no one is taking action.](http://sunelehmann.com/2015/01/15/privacy-part-i-why-everyone-is-complaining-but-no-one-is-taking-action/) - [Part II: Some examples of why privacy is important.](http://sunelehmann.com/2015/02/01/privacy-part-ii-some-examples-of-why-privacy-is-important/) - Part III: Why technical solutions will not work. - Part IV: Suggestion for simple rules for data. - Part V: Sharing and electronic traces present an even deeper problem. I'll present a sketch of a solution. - Part VI: Why all this does not mean you should not share your data. It's generally a great idea to share data with both corporations and governments. Maybe also something about why companies with reasonable data policies will have a competitive advantage.

#turkeyface

Thu, 27 Nov 2014 19:41:43 GMT

As part of our twitter bot experiment, my class on *social graphs and interactions* tried to use our robots to start a #turkeyface trend on Twitter for Thanksgiving. Looking at these photos, I simply don't get why this meme didn't take off :) ![](/assets/20141127_turkeyface/b3ztv6vccaasmf5.png) ![](/assets/20141127_turkeyface/b3zpk6xiaaa9zrz.jpg) ![](/assets/20141127_turkeyface/b3zob0jccaazvtt.jpg) ![](/assets/20141127_turkeyface/b3z1nyuiyairxpp.jpg) ![](/assets/20141127_turkeyface/b3z1ln-imae_4q9.jpg) ![](/assets/20141127_turkeyface/b3yzmeoimaeqwfu.jpg) ![](/assets/20141127_turkeyface/b3ywhzmciaaweec.png) ![](/assets/20141127_turkeyface/b3yr4oaieaezxsi.jpg) ![](/assets/20141127_turkeyface/b3ypygaccaaqnmv.jpg) ![](/assets/20141127_turkeyface/b3yjkamiyaa5i2j.jpg) ![](/assets/20141127_turkeyface/b3yhogqceaadcs6.jpg) ![](/assets/20141127_turkeyface/b3yhkcwciaavl9h.png) ![](/assets/20141127_turkeyface/b3yg2kzcuaasheb.png) ![](/assets/20141127_turkeyface/b3yfqfxieaasu9s.jpg) ![](/assets/20141127_turkeyface/b3yepmdcuamyae.jpg) ![](/assets/20141127_turkeyface/b3ydoiocmaa3y4i.jpg) ![](/assets/20141127_turkeyface/b3y9zxvicaikflo.jpg) ![](/assets/20141127_turkeyface/b3y4d2lcqaama-x.jpg) ![](/assets/20141127_turkeyface/b3y3z06cqaawibj.jpg) ![](/assets/20141127_turkeyface/b3y-6lziyaix0gb.jpg) ![](/assets/20141127_turkeyface/b3dj4wscyaaal4p-large.png) ![](/assets/20141127_turkeyface/b3bvwnximaaupt2.jpg) ![](/assets/20141127_turkeyface/b3athnwcyaawhdh.png) ![](/assets/20141127_turkeyface/b3akdiaiuaetszp.jpg)

Talks next weeks

Thu, 02 Oct 2014 20:34:21 GMT

It's not just the network structure that we care about. We want to understand network structure in order to get a handle on processes taking place on networks. Those kinds of processes are what next week's two exciting talks (Monday and Tuesday, October 6th and 7th, at 11am @ DTU) focus on. Both talks are open to the public, so I hope you'll join us if you're in Copenhagen. Full details here: ## Cornelia Betsch on Vaccination Decision Making - **Time:** Monday, October 6th, 2014 - **Place:** Technical University of Denmark, Building 321, 1st floor Lab Space - **Title:** Vaccination decision making – an individual and social perspective - **Speaker:** Dr. Cornelia Betsch. PD Dr. Cornelia Betsch is research fellow (Akademische Oberrätin) and scientific manager of the Center for Empirical Research in Economics and Behavioral Science (CEREB) at the University of Erfurt, Germany. She serves as a member of the European Technical Advisory Group of Experts on Immunization (ETAGE) of the WHO Europe and as a member of the German Commission for the Verification of Measles and Rubella Elimination (Federal Ministry of Health @ Robert Koch Institute). - **Abstract**: The desperate search for a vaccine against Ebola currently reminds us on the merits and value of vaccination. Still, there is a small but critical amount of parents and adults who decide not to vaccinate their children or themselves. They endanger public health goals such as the elimination of diseases like measles or polio. In this talk I will show from the individual perspective what may influence a decision against vaccination. Further, I will analyze the vaccination decision from a structural point of view and show the social perspective of vaccination decision making: as many vaccinated individuals can protect some unvaccinated individuals, it may be rational to forego vaccination and to free ride. 
Given we know something about how people make vaccination decisions, which strategies should we choose for vaccine advocacy? In the final part of the talk I will give some examples and link them to real-world challenges of vaccine communication. ## Jens Koed on Describing the psychology of argumentation - **Time:** Tuesday, October 7th, 2014 - **Place:** Technical University of Denmark, Building 321, 1st floor Lab Space - **Title:** Describing the psychology of argumentation, reasoning, and persuasion from a Bayesian perspective - **Speaker:** Jens Koed Madsen (Postdoc @ Birkbeck, University of London) - **Abstract**: Classical psychological models of persuasion and reasoning (Chaiken, 1980; Petty & Cacioppo, 1981) conceptualise rationality from the perspective of formal logical reasoning. Empirically, however, humans do not respond in line with logical predictions, as many fallacious arguments are accepted, and not all valid arguments are accepted. This has led to the conclusion that humans are not rational and to the development of the dual-process theory (consisting of a slow, laboured, and logical and a shallow, heuristic, and non-logical system). Recently, rationality has been recast as reasoning from uncertainty rather than reasoning from certainty from a Bayesian perspective (Oaksford & Chater, 2007). The paradigm has successfully been applied to reasoning (e.g. Oaksford & Chater, 2007), argumentation (e.g. Hahn & Oaksford, 2006; 2007), fallacies (e.g. Corner et al., 2011; Harris et al., 2012), persuasion (Madsen, 2013), and has integrated source credibility in a reasoning framework (Hahn et al., 2009; Harris et al., submitted). I work on three aspects of Bayesian persuasion: the conceptual development of the persuasion model from the thesis (Madsen, 2013), the psychological ontogenesis of probabilistic estimations, and the relationship between individualised approaches to belief changes and behaviour changes. 
These aspects touch upon the modelling, theoretical foundation, and application of the Bayesian approach developed in the past decade. **Bibliography for Jens' talk** Chaiken, S. (1980) Heuristic versus systematic information processing and the use of source versus message cues in persuasion, Journal of Personality and Social Psychology 39, 752-766 Corner, A., Hahn, U. & Oaksford, M. (2011). The psychological mechanism of the slippery slope argument. Journal of Memory & Language, 64, 133-152. Hahn, U., Harris, A. J. L., & Corner, A. (2009). Argument content and argument source: An exploration. Informal Logic, 29, 337-367. Hahn, U. & Oaksford, M. (2006a) A Bayesian Approach to Informal Reasoning Fallacies. Synthese 152, 207-23 Hahn, U., & Oaksford, M. (2007a) The rationality of informal argumentation: A Bayesian approach to reasoning fallacies, Psychological Review 114, 704-732 Hahn, U., Oaksford, M., & Harris, A. J. L. (2012). Testimony and argument: A Bayesian perspective. In F. Zenker (Ed.), Bayesian Argumentation (pp. 15-38). Dordrecht: Springer. Harris, A. J. L., Hahn, U., Madsen, J. K. & Hsu, A. S. (submitted) The Appeal to Expert Opinion: Quantitative support for a Bayesian Network Approach, Cognitive Science, XXX, xxx-xxx Madsen, J. K. (2013) Prolegomena to a Theory and Model of Persuasion Processing: A Subjective-Probabilistic Interactive Model of Persuasion (SPIMP), unpublished thesis, University College London Oaksford, M. & Chater, N. (2007) Bayesian Rationality: The probabilistic approach to human reasoning. Oxford, UK: Oxford University Press. Petty, R. E. & Cacioppo, J. T. (1981) Attitudes and persuasion: Classic and contemporary approaches, Boulder, CO: Westview Press

How to kill a Twitter Bot!

Wed, 10 Sep 2014 13:20:15 GMT

This Friday we're lucky to have visitor [Emilio Ferrara](http://www.emilio.ferrara.name/) presenting a talk on identifying Twitter bots. Emilio's work has been covered extensively in the media, for example in MIT Technology Review's *[How to spot a social bot on twitter](http://www.technologyreview.com/view/529461/how-to-spot-a-social-bot-on-twitter/)*. Details below:

- ***Date***: Friday September 12th, 2014
- ***Time***: 11:00-noon
- ***Place***: DTU Building 321, first floor lab space
- ***Speaker***: Emilio Ferrara ([@jabawack](https://twitter.com/jabawack)), [Post-doctoral Research Fellow at Indiana University Bloomington](https://www.linkedin.com/profile/view?id=53440344)
- ***Title***: The rise of social bots: fighting deception and misinformation on social media
- ***Abstract***: One of the classic problems in Computer Science, recognizing the behavior of a human from that of a computer algorithm (proposed by Alan Turing), has suddenly become very relevant in the context of social media. Limits to the expressive power of humans and real incentives abound to develop human-mimicking software agents called social bots. These elusive entities wildly populate social media ecosystems, often going unnoticed among the population of real people. Bots can be harmful, aiming at persuading, smearing, or deceiving, and for such a reason our research aims at developing efficient systems to detect them. In my talk I will discuss the characteristics of modern, sophisticated social bots, and how their presence can endanger online ecosystems and our society. Characteristics related to content, network, sentiment, and temporal patterns of activity are imitated by bots but at the same time can help discriminate synthetic behaviors from human ones, yielding signatures of engineered social tampering. I will present "Bot or Not?", a social bot detection framework prototype developed at Indiana University under the Truthy project. My talk will conclude depicting future scenarios and discussing related problems, such as that of studying persuasion campaigns on social media, how they spread, and how we can promptly detect and potentially hinder their diffusion.

Dynamic and Multiplex Networks

Tue, 26 Aug 2014 14:46:17 GMT

Network science buffs are in for a treat this Monday (September 1st, 2014), when we have a great set of visitors in my group at DTU. I'm excited to present talks on the cutting edge of what we know about networks by *János Kertész* and *János Török*. The talks will be back to back and detailed info can be found below. The talks are open to the public, so hope to see you there!

- **Time**: Monday September 1st, 10am - noon
- **Place**: DTU Building 321, room 134 (1st floor lab area).
- **Speakers**:
  - [János Török](http://www.phy.bme.hu/~torok/) (10am-11am). Associate professor at Budapest University of Technology, Department of Theoretical Physics.
  - [János Kertész](http://www.phy.bme.hu/~kertesz/) (11am-12pm). Professor & Director of the Institute of Physics, Budapest University of Technology and Economics

***Multi-level, multi-channel, multi-agent modeling of social interactions*** ([János Török](http://www.phy.bme.hu/~torok/))

**Abstract**: We present a model of society. Human relations are strengthened by communication and eroded by time. Communication is, in general, related to some social activity (work, friendship, hobby) or social context. Therefore we postulate that individuals having different social needs participate in a number of social contexts (family, workplace etc.) - which may also evolve in time - and communicate with other members of the contexts using different communication channels (face to face, phone, email, etc.) for different purposes and with different impact on their relationship. We show that using realistic input data from surveys and statistical data one can reproduce important features of real society like Dunbar's numbers and their meaning.
***Spreading on temporal networks: Results from empirical analysis, model calculations and simulation***([János Kertész](http://www.phy.bme.hu/~kertesz/)) **Abstract**: Spreading phenomena typically take place on temporal networks, where connections between the nodes are only occasionally and for limited time present. Such events can be, e.g., encounters of people, which are important for contagion or opening a communication channel needed for information transmission. We studied a mobile call network from this point of view: Having the time stamped records of the calls we played a ‘susceptible-infected’ game by infecting one node at random and assuming transmission at every possible event. We introduced different reference systems by appropriate shuffling of the data and identified this way the contributions of the different types of correlations to the speed of spreading. We concluded that there is a considerable slowing down as compared to the random models, mainly due to the correlations between the link weights and the topology and the inhomogeneous, bursty character of the events. We have also shown that the temporal inhomogeneity cannot be characterized by the inter-event time distribution (IETD) alone as there are strong dependencies between the events. In order to understand better the role of the different components we investigated models of temporal networks. In the analytically solvable infinite complete graph we showed that burstiness, i.e., power law IETD distribution always accelerates the process provided the clocks are positioned on the nodes. For the complementary case of link related burstiness we considered a number of models, like the analytically tractable Cayley tree, BA trees and networks. 
We show that if the stationary bursty process is governed by power-law IETD, the spreading can be slowed down or accelerated as compared to a Poisson process; the speed is determined by the short time behavior, which in our model is controlled by the exponent. We demonstrate that finite, so called "locally tree-like" networks, like the Barabási-Albert networks behave very differently from real tree graphs if the IETD is strongly fat-tailed, as the lack or presence of rare alternative paths modifies the spreading. A further important result is that the non-stationarity of the dynamics has a significant effect on the spreading speed for strongly fat-tailed power-law IETDs, thus bursty processes characterized by small power-law exponents can cause slow spreading in the stationary state but also very rapid spreading heavily depending on the age of the processes. **References**: 1. M. Karsai, M. Kivelä, R. K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, J. Saramäki: Small But Slow World: How Network Topology and Burstiness Slow Down Spreading, Phys. Rev. E 83, 025102 (2011) 2. Márton Karsai, Kimmo Kaski, Albert-László Barabási, János Kertész: Universal features of correlated bursty behavior, Scientific Reports 2, Article number 397 (2012) 3. Márton Karsai, Kimmo Kaski, János Kertész: Correlated dynamics in egocentric communication networks, PLoS ONE 7(7) e40612 (2012) 4. Hang-Hyun Jo, Márton Karsai, János Kertész, Kimmo Kaski: Circadian pattern and burstiness in human communication activity, New J. Phys. 14 013055 (2012) 5. Szabolcs Vajna, Bálint Tóth, János Kertész: Modelling power-law distributed interevent times, New J. Phys.15, article 103023 (2013) 6. Hang-Hyun Jo, Juan I. Perotti, Kimmo Kaski, János Kertész: Enhanced Spreading Dynamics by Non-Poissonian Processes, Physical Review X 4, 011041 (2014) 7. Dávid X. Horváth, János Kertész: Spreading dynamics on networks: the role of burstiness, topology and non-stationarity, New Journal of Physics 16 (7), 073037
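
For readers who want to play with the idea, the 'susceptible-infected' game from János Kertész's abstract (infect one node, then transmit at every possible event) can be sketched in a few lines of Python. This is only a toy illustration under our own assumptions: the function names `si_spread` and `shuffle_times`, the `(time, u, v)` event format, and the tiny example data are all made up here, and the time-stamp shuffle is just one possible reference model, not necessarily one of the shufflings used in the papers.

```python
import random


def si_spread(events, seed):
    """Run a susceptible-infected process over time-stamped contact events.

    events: list of (time, u, v) tuples, treated as undirected contacts.
    seed:   the initially infected node.
    Returns a dict mapping each infected node to its infection time.
    """
    ordered = sorted(events, key=lambda e: e[0])
    # Assumption: the seed counts as infected from the first recorded event.
    infected_at = {seed: ordered[0][0] if ordered else 0}
    for t, u, v in ordered:
        # Transmission at every contact between an infected and a susceptible node.
        if u in infected_at and v not in infected_at:
            infected_at[v] = t
        elif v in infected_at and u not in infected_at:
            infected_at[u] = t
    return infected_at


def shuffle_times(events, rng):
    """Toy reference model: permute time stamps, keep who-contacts-whom."""
    times = [t for t, _, _ in events]
    rng.shuffle(times)
    return [(t, u, v) for t, (_, u, v) in zip(times, events)]


if __name__ == "__main__":
    calls = [(1, "a", "b"), (2, "b", "c"), (3, "c", "d"), (4, "a", "d")]
    print(si_spread(calls, "a"))
    print(si_spread(shuffle_times(calls, random.Random(0)), "a"))
```

Comparing infection times on the original event list with those on shuffled versions is the basic logic behind asking which correlations speed up or slow down spreading.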

Demo of SensibleDTU data explorer

Tue, 19 Aug 2014 22:19:39 GMT

As part of a Master's project, Marta Magiera, a student in my group, has developed a great tool for visualizing geo-data. Check it out below (looks best in 720p).

A note on academic writing

Wed, 30 Jul 2014 16:42:42 GMT

I often give the following writing advice to my students. Today, in honor of efficiency, I decided I'd put my advice in a blog post, so I can just link to it in the future. Unless you're a great writer (in which case you don't have to follow any rules), the structure of academic text is the following:

- First you tell your readers what you're about to tell them.
- Then you tell the readers the thing you want to tell them.
- Finally you tell them what you've just told them.

This structure works on a number of levels in a thesis. On the level of the entire thesis, the introduction tells the reader what's going to happen in the text and the conclusion summarizes what just happened, while the chapters in between contain the actual work. But for each chapter, you should also put an introduction and conclusion around the content, and similarly for each section. Even within each subsection, it might be a good idea to start with an introductory sentence or two (setting the stage) and to wrap up at the end. You have to stop before it gets too pedantic, but I hope the point gets across. It's not exactly fractal, but almost.

Networking doesn't always work

Sat, 28 Jun 2014 04:06:33 GMT

With collaborators at MIT (first author is [Yves-Alexandre de Montjoye](http://www.demontjoye.com/ "Yves")) we have just published a paper in Scientific Reports, [The Strength of the Strongest Ties in Collaborative Problem Solving](http://www.nature.com/srep/2014/140613/srep05277/full/srep05277.html "Paper"). The paper shows that networking (in the sense of building a larger network of weak ties) does not improve team performance under some circumstances. We showed that for teams of knowledge workers in a competitive environment, the strongest ties (best friends or people you spend a lot of time with) explain much of the team performance in our statistical model. Said differently, a team’s strongest ties are the best predictor of how the team will perform. They predict performance better than any other factors we looked at such as the technical abilities of its members, how knowledgeable they are about the topic at hand, and even their personality. In fact, once you account for a team’s strongest ties none of these other factors matters. A neat infographic (created by Yves) explains the main findings and shows some of the key plots. ![](http://web.media.mit.edu/~yva/InfographicStrongestTies.png)

Public Lecture

Mon, 19 May 2014 12:27:18 GMT

I'm giving a public lecture tomorrow, Tuesday May 20th at 8pm. It'll be at *Byens Lys* on Christiania. Full details are available here: http://www.scienceandcocktails.org/2014/ComplexNetworks.html

Dirk Brockmann Visit

Fri, 21 Feb 2014 12:50:26 GMT

Next week, we're very lucky to have [Dirk Brockmann](http://rocs.hu-berlin.de/) visiting the lab. If you're anywhere near Copenhagen there's no excuse not to come and see him. He's a world class scientist (see below) and in addition to mind-expanding content, his talks often feature subtle humor, as well as legendary slideshows.

- **Date**: Thursday, February 27th
- **Time:** 13:30 - 14:30
- **Location**: [Technical University of Denmark, Building 324, Room 040](https://goo.gl/maps/DLCpl) [if you're traveling from Copenhagen, I recommend Bus 150S]
- **Title**: The hidden geometry of complex, network-driven contagion phenomena
- **Abstract**: See below.

Dirk is a theoretical physicist turned world expert in spreading patterns of contagious disease. His recent *Science* paper [The hidden geometry of complex, network-driven contagion phenomena](http://www.sciencemag.org/content/342/6164/1337) [Science **342**, 1337 (2013)] shows that it's possible to replace geographic distance by a probabilistically motivated effective distance which reveals a hidden geometry where disease arrival times can be accurately predicted.

[![Screenshot 2014-02-21 12.18.09](/assets/20140221_dirk_brockmann_visit/screenshot-2014-02-21-12-18-09_f08f5f5d.png)](/assets/20140221_dirk_brockmann_visit/screenshot-2014-02-21-12-18-09.png)

He's also done interesting work on human travel patterns based on how money travels, [Scaling laws of human travel](http://www.nature.com/nature/journal/v439/n7075/full/nature04292.html) [*Nature* **439**, 462-465 (2006)].

[![Screenshot 2014-02-21 12.22.22](/assets/20140221_dirk_brockmann_visit/screenshot-2014-02-21-12-22-22_c3c6fca3.png)](/assets/20140221_dirk_brockmann_visit/screenshot-2014-02-21-12-22-22.png)

Finally, Dirk's work has been used to fight crime on the US hit TV series [NUMB3RS](http://en.wikipedia.org/wiki/Numbers_(TV_series)) - check out the action packed clip below.
https://dl.dropboxusercontent.com/u/153071/numbers.mp4

Dirk is a professor at [Humboldt University](http://rocs.hu-berlin.de/) (recently returned from [Northwestern University](http://rocs.northwestern.edu/)).

The paper itself is Big Data

Fri, 14 Feb 2014 08:42:26 GMT

With 29 pages of text and 9 pages of references, the new paper [we've just put on arXiv](http://arxiv.org/abs/1401.7233) is almost big data in its own right (ok, not quite, but it's still a nice, big chunk of work). The paper outlines all the work we've done over the past couple of years to put together a great big testbed for network science, working to collect a multiplex dataset (face-to-face, telecommunication, social networks, geospatial- and demographic information) of around 1000 densely connected individuals.

[![Networks](/assets/20140214_the_paper_itself_is_big_data/screenshot-2014-02-14-08-31-51_6bb3fe1b.png)](/assets/20140214_the_paper_itself_is_big_data/screenshot-2014-02-14-08-31-51.png)

The abstract reads:

*This paper describes the deployment of a large-scale study designed to measure human interactions across a variety of communication channels, with high temporal resolution and spanning multiple years - the Copenhagen Networks Study. Specifically, we collect data on face-to-face interactions, telecommunication, social networks, location, and background information (personality, demographic, health, politics) for a densely connected population of 1,000 individuals, using state-of-art smartphones as social sensors. Here we provide an overview of the related work and describe the motivation and research agenda driving the study. Additionally the paper details the data-types measured, and the technical infrastructure in terms of both backend and phone software, as well as an outline of the deployment procedures. We document the participant privacy procedures and their underlying principles. The paper is concluded with early results from data analysis, illustrating the importance of multi-channel high-resolution approach to data collection.*

Get it here: [arXiv:1401.7233](http://arxiv.org/abs/1401.7233)

PostDoc opportunity

Wed, 12 Feb 2014 12:21:06 GMT

[PostDoc opportunity](http://www.dtu.dk/english/Research/Research-at-DTU/HC-Oersted-Postdoc---COFUND "PostDoc opportunity ") If you're a PhD student or young PostDoc interested in a "curiosity-driven, bottom-up research project" in my lab, the [Ørsted Postdoc positions linked here](http://www.dtu.dk/english/Research/Research-at-DTU/HC-Oersted-Postdoc---COFUND "PostDoc") are a great opportunity. Let me know, and we can consider designing a project together. And don't forget, the Danish PostDoc salaries are great. [![logo_marie-curie](/assets/20140212_postdoc_opportunity/logo_marie-curie_7223ef54.jpg)](/assets/20140212_postdoc_opportunity/logo_marie-curie.jpg)

Some years ago,...

Wed, 01 Jan 2014 15:50:38 GMT

> Some years ago (...), I said that our networked future was bracketed by the dystopian nightmares of two old-Etonian novelists, George Orwell and Aldous Huxley. Orwell thought we would be destroyed by the things we fear, while Huxley thought that we would be controlled by the things that delight us. What Snowden has taught us is that the two extremes have converged: the NSA and its franchises are doing the Orwellian bit, while Google, Facebook and co are attending to the Huxleyean side of things.

From "Here's how data thieves have captured our lives on the internet" by John Naughton in the Guardian/the Observer. [[link here](http://www.theguardian.com/commentisfree/2013/dec/29/internet-corporate-surveillance)]

Influential Bots!

Fri, 06 Dec 2013 21:40:46 GMT

A couple of days ago I wrote (with Piotr Sapieżyński) about our influential Twitterbots [[click here to read more](http://sunelehmann.com/2013/12/04/youre-here-because-of-a-robot/)]. We know the bots are great at getting followers, but what about other measures of influence? Today, on a whim, I checked the bots' [Klout](http://klout.com) scores, and was both surprised & impressed. I'm fairly certain that the bots have higher Klout scores than most readers of this page! (Let me know in the comments if I'm wrong). And they've only been tweeting for approx 90 days. For reference, my personal Klout score is currently 48, and I've been posting on Twitter since 2008, and have literally thousands of tweets to my name. What's even more impressive is that the bot Klout-scores are calculated based on Twitter alone. My own score is also partly based on contributions from Facebook and LinkedIn. Check out the gallery of bot Klout-scores by clicking on the images below.

![](/assets/20131206_influential_bots/48_olli.png) ![](/assets/20131206_influential_bots/47_mary.png) ![](/assets/20131206_influential_bots/46_jinxy.png) ![](/assets/20131206_influential_bots/44_jessica.png) ![](/assets/20131206_influential_bots/42_amanda.png) ![](/assets/20131206_influential_bots/42_david.png) ![](/assets/20131206_influential_bots/42_jennifer.png) ![](/assets/20131206_influential_bots/42_justin.png) ![](/assets/20131206_influential_bots/42_max.png) ![](/assets/20131206_influential_bots/40_madison.png) ![](/assets/20131206_influential_bots/40_steve.png) ![](/assets/20131206_influential_bots/37_madison.png) ![](/assets/20131206_influential_bots/35_noah.png) ![](/assets/20131206_influential_bots/33_ryan.png)

I suppose this says more about Klout's algorithm than about the bots' actual influence, but it's still an interesting tidbit.

You're here because of a robot

Wed, 04 Dec 2013 11:17:33 GMT

**Note**: This post is co-written with Piotr Sapieżyński

*Is it possible for a small computer science course to exert measurable influence (trending topics) on Twitter, a massive social network with hundreds of millions of users?* The surprising answer to that question is "yes". That's exactly what we did this year, using simple Python scripts and the Twitter API. Below we explain why & how + some of our findings along the way.

### Why Twitter bots?

The standard (spam)bot on Twitter has almost no followers, almost zero activity, and exists for a single simple purpose, for example to increase follower counts for certain individuals. For this year's *Social Graphs and Interactions* course, we wanted to do something different - we wanted to see how "intelligent" we could make our bots, using simple machine learning and network analysis methods ([the topics covered in the class](http://www.kurser.dtu.dk/02805.aspx?menulanguage=en-gb)).

[![The class](/assets/20131204_you_re_here_because_of_a_robot/photo_af966c5e.jpg)](/assets/20131204_you_re_here_because_of_a_robot/photo.jpg)

A large part of our motivation for investigating Twitter bots in class is that the amount of manipulation that humans are experiencing online is ever increasing. Think, for example, about how Facebook's time-line filtering algorithm shapes the world view of hundreds of millions around the globe. And that's just the most mainstream example. Instead of simply pointing out this fact, we thought that investigating how relatively simple bots can interact with and influence a real social system would be an interesting way for the students in our class to become aware of (and potentially counteract) some of those manipulations.

### Some basic findings

Our first finding was that getting followers on Twitter is surprisingly easy! All you need is a simple strategy that takes advantage of a tendency for most users to "follow back" if followed by someone (who looks like a not-too-spammy Twitter profile).
The recipe is simply

1. Manually create a realistic profile, including a few tweets.
2. Pick users with between 50 and 300 followers (people with high numbers of followers are less likely to follow back).
3. Follow about 100 new users per day.
4. Unfollow whoever doesn't follow you back within 24 hours (because users with a very asymmetrical ratio of follow***ing*** to follow***ers*** look like spam-bots).
5. Repeat steps 2-4.

Our initial target in the class was to get at least 50 followers that way, but in a relatively short time period the most successful bots had gained thousands of followers! Below is a plot of number of followers as a function of time for some of the most successful teams (for the first 50 days of the class).

[![Followers for 4 bots](/assets/20131204_you_re_here_because_of_a_robot/followers_2d5846c3.png)](/assets/20131204_you_re_here_because_of_a_robot/followers.png)

These large numbers of followers (along with systematic interactions added later) also translated to [very high Klout-scores](http://sunelehmann.com/2013/12/06/influential-bots/ "Influential Bots!").

Aside: In the beginning of the course, we focused on Justin Bieber followers. Here's a snapshot from a report describing an early avatar:

[![Bieber Snippet](/assets/20131204_you_re_here_because_of_a_robot/screenshot-2013-11-14-13-35-05_a301b749.png)](/assets/20131204_you_re_here_because_of_a_robot/screenshot-2013-11-14-13-35-05.png)

The next finding was that some teams in the class more or less inadvertently connected their bots to large "dark matter" components of the Twitter network, very large systems of spambots posting meaningless content and following back immediately in an automated fashion.
We did not explore these parts of the network in detail, but we note in passing that such areas are highly interesting for actual research, as they may create significant noise for analytics, skewing results for algorithms working to [predict the stock market](http://arxiv.org/abs/1010.3003) or [box office revenue](http://arxiv.org/abs/1003.5699) based on the Twitter firehose. The Twitter dark matter may create lots of noise on Twitter, but it is great for getting lots of followers quickly, and having many followers is a key part of a convincing Twitter persona, as many Twitter users tend to think that someone with thousands of followers must have something interesting to say.

### Social influence

As the course progressed, we focused on creating bots that could use machine learning to recognize "good" content for tweeting and retweeting. Bots that are able to detect topics within their tweet-stream ... and distinguish between real, human accounts and robots among their followers. However, the question remained: Can those thousands of followers be converted to influence on Twitter? For the class' final project, we decided to put that to the test. The overall goal was for each team to build a convincing bot, get human followers, and at a specified time, for everyone to work together to make specific hashtags trend on Twitter. So how to achieve that goal? Here's an overview of what each team worked on:

- Build convincing avatars and use the high follower-counts as part of the disguise.
- Use machine learning to tell who's a bot and who's not (in order to focus only on humans and ignore bots).
- Use natural language processing & machine learning to discover quality content to re-tweet and tweet.
- Use network theory to explore the network surrounding existing followers, making sure that bot actions reach entire communities.

Trending topics are defined for geographical areas.
Since Copenhagen is not very active in the twitter-verse (sadly, Copenhagen does not have trending topics on Twitter), we chose Boston, MA (where both of us have lived) for the experiments. Thus all bots were located in Boston (both in terms of profile text and tweet-geotags) and tweeted on an East Coast time table. Specifically, the bots started following people located in Boston based on self-reported language, location, profile description and geo-tagged tweets. By the end of a three-week period, more than 800 individuals in Boston followed at least one of our bots.

![](/assets/20131204_you_re_here_because_of_a_robot/network_growth.png)

Days are counted from the beginning of the class' final project. The sudden drop in friends/followers corresponds to one popular bot being banned for a few days.

In addition to each bot's idiosyncratic strategy for following new Twitter users, the bots maintained a shared list of Bostonians who had already followed (back) one of the bots. The idea being that if you follow more than one bot, content from the consortium of bots will increase proportionally in your Twitter stream.

![Bostonians following N bots. Again, days are counted from the beginning of the final project.](/assets/20131204_you_re_here_because_of_a_robot/intensity_growth.png)

Bostonians following N bots. Again, days are counted from the beginning of the final project.

With all that in place, we tried three distinct interventions, ordered by what we perceived to be increasing potential for virality. Each intervention consisted of a couple of manual tweets per bot and coordinated/automated re-tweeting and favoriting. The first hashtag #bostonthanks was designed to be an unusual (so as to be specific to our intervention) Thanksgiving greeting that we hoped would become one of the chosen Thanksgiving greetings for Boston. It didn't really take off.
The idea behind the second hashtag #MeInThree was to start a hashtag that would allow people to describe themselves in three words/concepts (something that is fun and fits within Twitter's 140 characters). That didn't work either. The third hashtag #banksyinboston was designed around secretive British artist and prankster, Banksy, who travels the globe pseudonymously and interacts with the world through movies, graffiti, and happenings. The idea was to create a couple of primitive "fake" Banksy artworks and start a #banksyinboston discussion "Is he really here?" in what we hoped would be the spirit of Banksy himself. Boston also has original Banksy art, which might add to the discussion. [![banksy](/assets/20131204_you_re_here_because_of_a_robot/dsc_0978_e846d992.png)](/assets/20131204_you_re_here_because_of_a_robot/dsc_0978.png) Above is one of the crudely made Banksy fakes with background image of Trinity church from Google Street View: note the artifact in the upper left edge of the photo. (Interestingly [another and much more elaborate Banksy-in-Paris](http://www.huffingtonpost.com/2013/12/02/banksy-in-paris-website-fake_n_4373478.html) hoax/non-hoax started the same day!) Much to our surprise (after two failed attempts), the third attempt actually succeeded in our stated goal of influencing the trending topics on Twitter!! [![Trendsmap.com](/assets/20131204_you_re_here_because_of_a_robot/trending_banksyinboston_388617be.png)](/assets/20131204_you_re_here_because_of_a_robot/trending_banksyinboston.png) We did fall short of trending on Twitter's own website, but #banksyinboston managed to get to the top of the trending list for the competing site [trendsmap.com](http://trendsmap.com "Trendsmap"). Analyzing the subsequent cascade of tweets reveals a couple of interesting things. Firstly, existing Boston graffiti was indeed re-discovered. 
Secondly, and most importantly, many Bostonians were highly effective in discovering that #banksyinboston was indeed a prank and spreading the word, here's one example. https://twitter.com/jhaley617/status/407586256881987585 And much of the discussion related to the #banksyinboston was dedicated to putting the notion to rest. [This echoes the behavior observed during the London Riots in 2011](http://www.theguardian.com/uk/interactive/2011/dec/07/london-riots-twitter). ### Perspectives So at the end of the day, did the Twitter bots have influence in Boston? We stress that this is an anecdotal test and only the most viral hashtag made it to the trending list. With a little over 800 Bostonian followers, our bots did not infiltrate Boston - and most Twitter users in Boston never interacted with one of our bots. But what we did show was that a few dedicated bots can make a difference. In four weeks, we managed to put together a small network with substantially more impact than a single individual with a similar number of followers. Most importantly, someone with more time & resources could easily put together a much larger system of coordinated bots that - in terms of advertisement - could be used to gently boost interest in an upcoming movie/similar. Or - with malevolent intent - could use a network of "sleeper bots" to systematically spread mis-information, e.g. injecting talking points into Twitter streams on a global scale. We hope that this little experiment can be helpful in creating awareness of such subtle manipulations before they begin shaping our public conversations. ### Appendix: Twitter bots - what are those!? Here, we provide a bit of context on Twitter bots. The earliest recorded document (that we could find) on Twitter bots is a great Ignite talk by [Tim Hwang](http://brosephstalin.com "Tim Hwang") from way back in 2009. 
https://www.youtube.com/watch?v=HdxOr5MaJok

Some of the ideas in Tim's talk were later tested [by the web ecology project](http://www.webecologyproject.org/2011/01/help-robots-take-over-the-internet-the-socialbots-2011-competition/ "Web Ecology"), and in a [class at University of Washington](http://gumption.typepad.com/blog/2011/06/socialbots-2-artificial-intelligence-social-intelligence-and-twitter.html), and recently bots have received lots of attention in the tech and business press (e.g. Wall Street Journal: [Inside a Twitter Robot Factory](http://online.wsj.com/news/articles/SB10001424052702304607104579212122084821400), The Atlantic: [Why Did 9,000 Porny Spambots Descend on This San Diego High Schooler?](http://www.theatlantic.com/technology/archive/2013/11/why-did-9-000-porny-spambots-descend-on-this-san-diego-high-schooler/281773/)).
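
### Appendix: the follow-back recipe as a toy simulation

For the curious, here is roughly what the logic of the follow-back recipe from "Some basic findings" looks like when written down in Python. To be clear, this is not the code used in class and it does not talk to the Twitter API at all. It is a minimal simulation on made-up data: the function name `run_follow_back_recipe`, the `candidates` dictionary, and the `follows_back` stand-in are all hypothetical, and only the numbers (50 to 300 followers, about 100 follows per day, unfollow after 24 hours) come from the recipe itself.

```python
def run_follow_back_recipe(candidates, follows_back, days, per_day=100,
                           min_followers=50, max_followers=300):
    """Simulate steps 2-5 of the recipe on toy data (no real API calls).

    candidates:   dict mapping user id -> follower count (made-up data).
    follows_back: function(user id) -> bool, a stand-in for whether that
                  user follows the bot back within 24 hours.
    Returns the set of users who ended up following the bot.
    """
    # Step 2: keep only users with between 50 and 300 followers.
    pool = [user for user, count in candidates.items()
            if min_followers <= count <= max_followers]
    followers = set()
    # Step 5: repeat, one iteration per simulated day.
    for _ in range(days):
        # Step 3: follow about 100 new users per day.
        batch, pool = pool[:per_day], pool[per_day:]
        if not batch:
            break
        # Step 4: after 24 hours, keep whoever followed back;
        # everyone else in the batch is unfollowed (simply dropped here).
        followers.update(user for user in batch if follows_back(user))
    return followers


if __name__ == "__main__":
    # Toy population: user i has i followers; even-numbered users follow back.
    toy_users = {i: i for i in range(400)}
    gained = run_follow_back_recipe(toy_users, lambda user: user % 2 == 0, days=10)
    print(len(gained), "followers gained in the toy simulation")
```

In the simulation the follow-back behavior is a fixed rule, so the result is deterministic. With real users it is of course a matter of probabilities, which is exactly why step 2 targets accounts that are more likely to follow back.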

Dynamic Network Talks

Sun, 17 Nov 2013 13:06:11 GMT

This Thursday (Nov 21st) from 14-16, we're delighted to present two exciting talks on dynamic & complex networks. [Tanya Berger-Wolf](http://compbio.cs.uic.edu/~tanya/) from University of Illinois at Chicago will discuss collective dynamics in the social network of primates, and [Joachim Mathiesen](http://www.nbi.dk/~mathies/) from the Niels Bohr Institute will talk about [excitable dynamics on Twitter](http://www.pnas.org/content/110/43/17259.short).

- **Location**: DTU, Building 306, Room 97 (First floor)
- **Time**: November 21st, 14:00-16:00

**Speaker**: Tanya Berger-Wolf (Associate Professor, University of Illinois at Chicago)

**Title**: Animals as Mobile Social Users

**Abstract**: Recent advances in data collection technology, such as GPS and other mobile sensors, high definition cameras, and UAVs, have given biologists access to high spatial and temporal resolution data about animal populations. Many of the questions biologists are asking while trying to leverage those data are similar to questions being asked about mobile users. Why do animals go here rather than there? How does location influence activity and social interactions? How do social interactions influence activity and movement choices? How are movement decisions being made in a group and individually? While some of the methodology for answering those questions has been developed for understanding human behavior, animals offer the advantage of visible and trackable interactions and movements, simpler context and rules of behavior, and no privacy issues. I will present examples of the recent developments from the mobile world of animal populations, show some of the methodology we have developed for understanding their mobile social networks, and discuss the challenges for understanding these kinds of data, common to all animals, including humans.

**Bio**: Dr. Tanya Berger-Wolf is an Associate Professor in the Department of Computer Science at the University of Illinois at Chicago, where she heads the Computational Population Biology Lab. Her research interests are in applications of computational techniques to problems in ecology and population biology of plants, animals, and humans, from genetics to social interactions. As a legitimate part of her research she gets to fly in a super-light airplane over a nature preserve in Kenya, taking a hyper-stereo video of zebra populations. Dr. Berger-Wolf received her Ph.D. in Computer Science from University of Illinois at Urbana-Champaign in 2002. After spending some time as a postdoctoral fellow working in computational phylogenetics and doing research in computational epidemiology, she returned to Illinois. She has received numerous awards for her research and mentoring, including the US National Science Foundation CAREER Award in 2008 and the UIC Mentor of the Year (2009) and Graduate Mentor (2012) awards.

**Speaker**: Joachim Mathiesen (Associate Professor, Niels Bohr Institute)

**Title**: Excitable human dynamics driven by extrinsic events in massive communities

**Abstract**: Online social networks are emphatically a global phenomenon which has changed the way people interact. Using data from Twitter and on trading volumes of financial securities, we analyze the correlated human activity in massive social organizations. The activity, typically excited by real-world events and measured by the occurrence rate of international brand names and trading volumes, is characterized by intermittent fluctuations with bursts of high activity separated by quiescent periods. These fluctuations are broadly distributed with an inverse cubic tail and have long-range temporal correlations with a 1/f power spectrum. We describe the activity by a stochastic point process and derive the distribution of activity levels from the corresponding stochastic differential equation. The statistical properties of the systems that we consider have similarities with a wide range of social systems and might therefore provide insight into general human behavior in large social organizations.

Public talk at Danish Royal Academy

Tue, 17 Sep 2013 09:51:17 GMT

If you're in Copenhagen, if you speak Danish, and if you're not already an expert on networks, I've got a public lecture on Complex Networks coming up at the Danish Royal Academy. The abstract (translated from Danish) is

> Networks are everywhere. Deep inside our cells, genes regulate one another as part of a complex network. If you draw a picture of which animals eat each other in the food chain, you get a complex network. People connect to one another via social networks that span personal contact, phone calls, online social networks, and so on. And on a global scale, Wikipedia, the internet, and our collective knowledge form a complex network. Since the turn of the millennium there has been a shift in our understanding of these networks, and a new field, network science, is emerging. In the lecture I tell the story of modern network theory, explain some of the new insights, and finish by tying it to my own research on social networks.

It's a fancy place and the building itself is worth a visit. ***If you're interested, you must sign up to attend***; [follow this link](http://www.royalacademy.dk/Offentlige-foredrag.aspx?ID=7&&CalendarEventID=104 "Sign up") to sign up.

[Update, Sept 30]: Here's the video (in Danish) http://www.youtube.com/watch?v=dvO_EcTPq9I (Looks best in 720p.)

Case Closed

Tue, 20 Aug 2013 19:59:45 GMT

This morning, when I entered my office I was greeted by hundreds of Nexus 4 cases. Yesterday [our team](https://www.sensible.dtu.dk/?page_id=1096 "great team") of hardworking, thoughtful & highly intelligent grad students were wasting their years of education, doing sweat-shop level work at an undisclosed location in the greater Copenhagen area. Their task: rooting & flashing more than one thousand Nexus 4 phones for our SensibleDTU project. Thanks to some creative hacking, the whole thing took about 9 hours. Being a (relatively) old man with little/no staying power, I left a bit early to let the rest of the team finish up. Little did I know that they saved about 1/3 of the Nexus outer boxes, hauled them back to DTU and used them to decorate my office. The result is shown below :) ![](/assets/20130820_case_closed/img_0031.jpg) ![](/assets/20130820_case_closed/img_0032.jpg) ![](/assets/20130820_case_closed/img_0033.jpg) ![](/assets/20130820_case_closed/img_0034.jpg) ![](/assets/20130820_case_closed/img_0035.jpg) ![](/assets/20130820_case_closed/img_0036.jpg) ![](/assets/20130820_case_closed/img_0037.jpg) ![](/assets/20130820_case_closed/img_0038.jpg)

Open Master's Projects

Mon, 19 Aug 2013 21:46:27 GMT

We've just posted a number of exciting new master's projects starting fall 2013 at the SensibleDTU site. Interested students should **[click here](https://www.sensible.dtu.dk/?page_id=1424 "Positions")** to go directly to see the projects.

Complex Adaptive Systems

Sat, 13 Jul 2013 19:25:16 GMT

For my summer reading I'm aiming to catch up on the classics. It's all about finding the sweet spot of books that make you smarter, but aren't too boring to bring to the beach. For me that's usually an Elmore Leonard novel, or - if I can muster the discipline (and that pretty much never happens) - something by a more literary writer. This year, however, I've been extra good and have been reading "Complex Adaptive Systems" by Miller & Page. It's an introduction to computational models of social systems, and contains a great set of reflections on the use of computation as a tool for building theory + surveys a lot of work I never got around to reading. I'm not sure I agree with the authors on everything in the book (not really a fan of agent-based models in the age of big data), but one section that resonated with me is their *Physics envy: A pseudo-freudian analysis*. They start out the section by writing:

> During the late nineteenth century, various “cargo cult” societies emerged in the South Pacific. By the mid-twentieth century, inspired by their experiences during World War II, these societies built elaborate mock facilities, such as airstrips and control towers, in hopes of attracting deliveries of goods similar to those that colonial officials once received. Like these societies, we suspect that much of the current view and apparatus of theory in economics is based on misinterpreted observations and misplaced hopes.

So the tone is set. This sounds a lot like the kind of old-school physicists who [tend to see themselves as the lords of the academic jungle](http://sitearm.wordpress.com/2012/03/16/here-come-the-physicists-by-duncan-j-watts/ "Brilliant Duncan Watts quote"). Being a kind of physicist myself, that's how I grew up, and there's a certain familiarity to these arguments that reminds me of the halls of the Niels Bohr Institute ... and generates a warm, fuzzy feeling when reading stuff like this.
My own views have matured a bit since those days, and I'm more in line with Cosma Shalizi's view (as argued elegantly [here](http://vserver1.cscs.lsa.umich.edu/~crshalizi/weblog/347.html "Networks and netwars") and [here](http://arxiv.org/abs/adap-org/9910002 "A Simple Model of the Evolution of Simple Models of Evolution")) that physicists have a bad habit of trying to take up new subject-matter and not learning what's already known about it ... Thus, it's a fun start (for a physicist) to see Miller & Page openly - and without any hesitation - comparing economics to a cargo cult. And there is a more serious point coming up next:

> There is a commonly held perception in economics that its approach to theorizing closely follows the “one” that is used in physics. Indeed, at certain levels, modern economic theory does resemble some parts of physics, where a small set of well-formulated mathematical models is applied to a broad spectrum of the world. However, based on our interactions at the Santa Fe Institute with a fine group of theoretical physicists, we find that this narrow view of theoretical work is far too restrictive to capture either the reality or the potential of what other fields like physics have to offer in terms of ways to approach theoretical questions in the social sciences.
>
> Theoretical physicists are concerned with, and rewarded by, finding insights about nature through the creation of models and the generation of hypotheses. **The emphasis here is on understanding nature, not on the tools used to gain this understanding**. [my emphasis]

This, I think, hits the nail on the head. That's precisely what I think is great about physics. And this emphasis on insight and disregard for choice of methodology is the trait that distinguishes physicists from scientists in many of the other fields (at least in my experience). A clarification is in order here (also just so I don't sound too much like one of those old-school physicists myself).
The thing that Miller & Page focus on is the insight, but I think that's missing the point a little bit. Every scientific field is about insight - and the search for new insights had better be common to every field of science. *What's special about physics might just be the disregard for the method with which the insight was obtained.* I honestly don't know other fields (or philosophy of science for that matter) well enough to know if what I'm about to say is true (but random thoughts are what this blog is for, and comments are welcome), but my sense is that *most fields are defined primarily via the methods they use*. For example, if I want to publish in a computer science journal, I have to actually develop new theory that connects to the existing theory in computer science, using the methods of computer science - my sense is that this pattern is true more generally. The point is made even clearer as Miller & Page continue:

> The premium in theoretical physics is on gaining insight into interesting phenomena. If the insight is there, then there is little desire for mathematical rigor. Consequently, in physics there is a sharp distinction between the mathematical and theoretical branches. Having a good insight and stating a theorem that is not rigorously proved is acceptable behavior. Once, during a talk at the Santa Fe Institute, a well-known theoretical physicist was asked if he could rigorously prove a proposition that he had just made, and his answer was “No, and I don’t need to, but I’m sure someone can.” On first hearing by most economists, this seemingly casual approach to scientific theory is scandalous at best; yet, ultimately it becomes a very productive way to make scientific progress.
>
> While axiomatic rigor is not required for theoretical work in physics, there is still a high premium on good theory - **just not on the tools used to develop the theory**. Theory must result in insight and withstand testing.
[my emphasis]

And perhaps I need another caveat here. I readily concede that there are plenty of 'conservative' physicists who tend to deem everything that's not strictly-within-the-boundaries-of-classical-physics as "non-physics"; and who would not think Miller & Page's work is 'real physics'. And those same guys might very well be critical towards methods appropriated from other fields, if you were lucky enough to have them review your paper in a physics journal. But that being said, I do think there's a tendency for physics to be more inclusive of new methods as long as they get the job done. If I were to name a reason, I'd say maybe it's because an education in physics already includes a very wide variety of mathematical tools? What I'm working on these days probably won't be classified as physics by most people, but what a background in physics has provided me is not just a focus on insight over everything else, but a lack of respect for *how* to build a good theory. As long as you employ the rules of logic & common sense, as long as your work can be reproduced by others, and as long as you're able to convince most rational beings that you're right, the details of how you got there are less important.

New SensibleDTU graphic

Wed, 19 Jun 2013 12:49:47 GMT

New PhD student in my lab [Piotr Sapieżyński](http://dk.linkedin.com/pub/piotr-sapie%C5%BCy%C5%84ski/23/1a9/253) has been working on creating a new logo/header for SensibleDTU and I think what we've ended up with is pretty awesome! The progression neatly illustrates the power of iteration. The idea for even updating the old header (shown below) [![sensible_header](/assets/20130619_new_sensibledtu_graphic/sensible_header_dfcf9359.png)](/assets/20130619_new_sensibledtu_graphic/sensible_header.png) originated from the image below that we found on Tue Herlau's [personal homepage](http://www2.imm.dtu.dk/~tuhe/). [![Starting point](/assets/20130619_new_sensibledtu_graphic/tue_bcab939f.png)](/assets/20130619_new_sensibledtu_graphic/tue.png) That thing was just too cool, so we begged Tue for the script, and he agreed to share. But we also wanted to introduce a few changes, such as DTU-red (153,0,0) letters on a colorful background, and to reduce the "noise" a bit. So the first version looked like this: [![proto1](/assets/20130619_new_sensibledtu_graphic/proto1_6148b58e.png)](/assets/20130619_new_sensibledtu_graphic/proto1.png) Not bad, but definitely room for improvement. We decided to get rid of the square nodes, play around with node-sizes, increase density of nodes inside the letters, and try out colors nabbed from the [WWDC13 logo](https://devimages.apple.com.edgekey.net/wwdc/images/wwdc13-icon.jpg "WWDC") for the "background" network. [![proto2](/assets/20130619_new_sensibledtu_graphic/proto2_d7b3adc6.png)](/assets/20130619_new_sensibledtu_graphic/proto2.png) This looks great - and I was happy! But at this point Piotr was simply unstoppable. Below is the final version, which I think is just gorgeous. [![final](/assets/20130619_new_sensibledtu_graphic/final_b69fcfa9.png)](/assets/20130619_new_sensibledtu_graphic/final.png) This time the changes are more subtle. Node borders have been removed. The edge colors are now the official DTU-grey (153,153,153).
Simultaneously, the height has been reduced to produce a more compact look, and the background nodes now change color continuously across a bright rainbow-ish spectrum (no longer WWDC13 colors). Click each image to see a larger version - and as always, comments & questions are welcome.
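For the curious, here is a minimal sketch of how a continuous, rainbow-ish color change across the header could be computed. This is not Tue's or Piotr's actual script (which isn't shown here). The function names `rainbow_color` and `to_hex`, the saturation and brightness values, and the idea of mapping a node's horizontal position to a hue are all assumptions made for illustration. Only the DTU-red and DTU-grey RGB triples come from the post itself.

```python
import colorsys

DTU_RED = (153, 0, 0)        # letter color quoted in the post
DTU_GREY = (153, 153, 153)   # edge color quoted in the post


def rainbow_color(x, width, saturation=0.6, value=1.0):
    """Map a horizontal position x in [0, width] to a bright RGB triple.

    The hue sweeps from 0.0 (red) to 0.8 (violet), so the two ends of
    the header do not wrap around to the same color. The hue range,
    saturation and value are illustrative assumptions, not the settings
    used for the real header.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    fraction = min(max(x / width, 0.0), 1.0)   # clamp to [0, 1]
    hue = 0.8 * fraction
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
    return (round(r * 255), round(g * 255), round(b * 255))


def to_hex(rgb):
    """Format an (r, g, b) triple of 0-255 ints as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


if __name__ == "__main__":
    # Print the two fixed DTU colors, then a few sample background colors.
    print(to_hex(DTU_RED), to_hex(DTU_GREY))
    for x in range(0, 1001, 250):
        print(x, to_hex(rainbow_color(x, 1000)))
```

Keeping saturation below 1 is what gives a "bright" pastel look rather than fully saturated primaries; any plotting library that accepts RGB or hex colors could consume the output.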

NetSci 2013: Nine reasons to go (all of them)

Thu, 14 Mar 2013 11:44:52 GMT

Phew. [There's only one day to go before the call for abstracts closes](http://netsci2013.net/wordpress/call-for-papers/ "CFP"), but with a [burst](http://www.amazon.com/Bursts-Hidden-Patterns-Everything-Crusades/dp/B0064XDKTA "Bursts") of posts in the last couple of days, I finally managed to complete the list of 9 reasons to visit NetSci this year. This post summarizes and lists all 9 reasons (although it'd be great if you could read them all) - hoping to convince you that you absolutely need to head on over and [submit a 1-page abstract](http://netsci2013.net/wordpress/call-for-papers/ "CFP") on your most recent fun result ... so you can join in on the fun. The very best reason to go is all the interesting, intelligent, and creative people you will meet in Copenhagen.

- [Reason 1. The venue](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)"). Check out the post - the pictures speak for themselves.
- [Reason 2. The NetSci school](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)"). Amazing speakers (headlined by A-L Barabasi) and great tutorials on NetworkX, Gephi, and Mathematica.
- [Reason 3. Brand new types of talks](http://sunelehmann.com/2013/03/07/netsci-2013-nine-reasons-to-go-reason-3/ "NetSci 2013: Nine reasons to go (Reason 3)"). In addition to the good old fashioned talks, we're also doing plenary [Ignite](http://igniteshow.com "Ignite") Sessions and Young Investigator talks.
- [Reason 4. Publisher session](http://sunelehmann.com/2013/03/08/netsci-2013-nine-reasons-to-go-reason-4/ "NetSci 2013: Nine reasons to go (Reason 4)"). We'll have a session where publishers from NPG and APS (among others) will tell you how to write the papers they want to see.
- [Reason 5. The superb list of satellite workshops](http://sunelehmann.com/2013/03/11/netsci-2013-nine-reasons-to-go-reason-5/ "NetSci 2013: Nine reasons to go (Reason 5)"). Speaks for itself.
- [Reason 6. The NetSci Dinner](http://sunelehmann.com/2013/03/12/netsci-2013-nine-reasons-to-go-reason-6/ "NetSci 2013: Nine reasons to go (Reason 6)") will be delicious and feature fine wines.
- [Reason 7. The snarky/silly reason](http://sunelehmann.com/2013/03/13/netsci-2013-nine-reasons-to-go-reason-7/ "NetSci 2013: Nine reasons to go (Reason 7)"). My (partially failing) attempt to be funny with movie references. (Help from Mason Porter improved this post.)
- [Reason 8. The updated NetSci look](http://sunelehmann.com/2013/03/13/netsci-2013-nine-reasons-to-go-reason-8/ "NetSci 2013: Nine reasons to go (Reason 8)") is beautiful and due to Chair of Design, Isabel Meirelles (and Petter).
- [Reason 9. Great group of invited speakers](http://sunelehmann.com/2013/03/14/netsci-2013-nine-reasons-to-go-reason-9/ "NetSci 2013: Nine reasons to go (Reason 9)"). Who will entertain you and make you smarter while they're at it.

![Don](/assets/20130314_netsci_2013_nine_reasons_to_go_all_of_them/cute-puppy.jpg)

Don't forget to submit an abstract. And check out the [brand new poster](https://dl.dropbox.com/u/153071/poster_NetSci2013.pdf "NetSci Poster") for more info + don't forget to retweet, +1, put on mailing lists, and so on.

NetSci 2013: Nine reasons to go (Reason 9)

Thu, 14 Mar 2013 08:38:05 GMT

The 9th and final reason is an obvious one (but after all, probably still the best reason for attending the conference). ***And don't forget that the call for abstracts closes tomorrow!*** So [head over and submit an abstract](http://netsci2013.net/wordpress/call-for-papers/ "CFP") (if you haven't done so already).

## Reason 9: Great list of invited speakers

This year we've managed to convince an amazing group of keynotes and invited speakers to participate. I can't wait to see all of them - and hear what they've been up to since last year! Here's the list (in the official NetSci design).

[![speakers](/assets/20130314_netsci_2013_nine_reasons_to_go_reason_9/speakers_6fc0cc80.png)](/assets/20130314_netsci_2013_nine_reasons_to_go_reason_9/speakers.png)

And that's it folks, this concludes the series - and don’t forget to check out the other reasons to visit NetSci 2013 if you haven't seen them yet.

- [Reason 1. The venue](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)")
- [Reason 2. The NetSci school](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)")
- [Reason 3. Brand new types of talks](http://sunelehmann.com/2013/03/07/netsci-2013-nine-reasons-to-go-reason-3/ "NetSci 2013: Nine reasons to go (Reason 3)")
- [Reason 4. Publisher session](http://sunelehmann.com/2013/03/08/netsci-2013-nine-reasons-to-go-reason-4/ "NetSci 2013: Nine reasons to go (Reason 4)")
- [Reason 5. The superb list of satellite workshops](http://sunelehmann.com/2013/03/11/netsci-2013-nine-reasons-to-go-reason-5/ "NetSci 2013: Nine reasons to go (Reason 5)")
- [Reason 6. The NetSci Dinner](http://sunelehmann.com/2013/03/12/netsci-2013-nine-reasons-to-go-reason-6/ "NetSci 2013: Nine reasons to go (Reason 6)")
- [Reason 7. The snarky/silly reason](http://sunelehmann.com/2013/03/13/netsci-2013-nine-reasons-to-go-reason-7/ "NetSci 2013: Nine reasons to go (Reason 7)")
- [Reason 8. Updated NetSci look](http://sunelehmann.com/2013/03/13/netsci-2013-nine-reasons-to-go-reason-8/ "NetSci 2013: Nine reasons to go (Reason 8)")

NetSci 2013: Nine reasons to go (Reason 8)

Wed, 13 Mar 2013 19:32:49 GMT

We're nearing the end of the list of reasons to go to NetSci ([and the call for abstracts ends this Friday, so go and submit something!](http://netsci2013.net/wordpress/call-for-papers/ "Go submit ASAP (if you haven't done so already)")). This 8th reason is not so much a specific reason to go to Copenhagen. Rather, it is an example of how we've tried to rethink & rework every aspect of the conference, how we've done our best to make it the best conference we possibly could.

## Reason 8: Re-designed graphics & logo

While I love the classic NetSci design pictured below, Petter and I agreed that it might be time to update the NetSci look.

[![Screen Shot 2013-03-13 at 7.06.40 PM](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-06-40-pm_98f71207.png)](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-06-40-pm.png)

Along with chair of design, Isabel Meirelles, we set out to update the look. The first draft was a bit simpler than what we ended up with, but amazing all the same. Take a look:

[![First draft](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-09-15-pm_1e1d2ac7.png)](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-09-15-pm.png)

The motivation came from Isabel. Based on a visual idea from Sebastian Ahnert's talk at NetSci 2012, she noted:

*Inspired by the sketch tool by Sebastian, I am proposing the overlapping circles rather than the wave in previous editions of the conference.*

*The colors could change every year, either using the flag colors of the hosting country (e.g., 2013 Denmark, 2014 USA, 2015 Germany, etc.) or just arbitrary (e.g., 2016).*

*The font (Bauhaus) was selected for the way letters seem to connect in this very geometric fashion.*

The next iteration was a bit more complex, but the central idea remains.

[![Second Draft](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-18-01-pm_1adc3de7.png)](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-18-01-pm.png)

This is the version we went with - and if you look closely, you'll see that the colors in this year's logo represent the colors seen in the Scandinavian flags. The logo itself also went through many iterations of input from Petter and Isabel; here are some of the intermediate versions:

[![logo a](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-26-09-pm_d409be04.png)](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-26-09-pm.png) [![logo b](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-25-53-pm_83387eb5.png)](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-25-53-pm.png)

Isabel's rationale on the final design is worth reproducing in full:

*Inspired by the sense of community that is NetSci, I am proposing a new logo that communicates this feature. It combines a visual mark with its verbal description.*

[![final logo](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-30-07-pm_f8435cd0.png)](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-30-07-pm.png)

*The visual mark is a play with the commonly used diagrammatic representation for close connection within a community. The logo starts and ends with the highly connected community, in that the red serving as the dot for the “i” is the same as the white dot inside the larger circle. It is also a play with figure/ground perception. Ultimately, I hope that the displacement of the red dot gives a sense of motion, both metaphorically and literally. The color red is a remainder of the previous logo.*

*The typeface Bauhaus was selected for the way letters seem to connect in a very geometric fashion. I used different type weights to provide a more dynamic display while also facilitating readability of the acronym for those not familiar with it. The logo might have its full description or be used without it. I think it can depend on the material being published.*

*The web site banner shows the logo in addition to the dates, the image of the venue, and overlapping circles. The overlapping circles represent the several communities participating in the conference. It was inspired by the sketch tool devised by Sebastian Ahnert and presented at this year’s NetSci. The color palette for the circles reflects the colors of the country or region hosting the event. This year, the colors are those of Scandinavian flags, with the more obvious reference to Denmark. The palette should change every year accordingly.*

[![final](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-44-26-pm_34b5a35a.png)](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_8/screen-shot-2013-03-13-at-7-44-26-pm.png)

Stay tuned for the remaining reason – and don’t forget to check out the other reasons to visit NetSci 2013.

- [Reason 1. The venue](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)")
- [Reason 2. The NetSci school](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)")
- [Reason 3. Brand new types of talks](http://sunelehmann.com/2013/03/07/netsci-2013-nine-reasons-to-go-reason-3/ "NetSci 2013: Nine reasons to go (Reason 3)")
- [Reason 4. Publisher session](http://sunelehmann.com/2013/03/08/netsci-2013-nine-reasons-to-go-reason-4/ "NetSci 2013: Nine reasons to go (Reason 4)")
- [Reason 5. The superb list of satellite workshops](http://sunelehmann.com/2013/03/11/netsci-2013-nine-reasons-to-go-reason-5/ "NetSci 2013: Nine reasons to go (Reason 5)")
- [Reason 6. The NetSci Dinner](http://sunelehmann.com/2013/03/12/netsci-2013-nine-reasons-to-go-reason-6/ "NetSci 2013: Nine reasons to go (Reason 6)")
- [Reason 7. The snarky/silly reason](http://sunelehmann.com/2013/03/13/netsci-2013-nine-reasons-to-go-reason-7/ "NetSci 2013: Nine reasons to go (Reason 7)")

NetSci 2013: Nine reasons to go (Reason 7)

Wed, 13 Mar 2013 18:47:39 GMT

When I started the "Nine reasons" series of posts, I had planned a single Letterman top-ten style "silly" reason. I'd forgotten, however, that the general NetSci audience is a clever bunch, so I was a little bit disappointed (but also impressed) when I posted "Reason 1", and Mason Porter saw that one coming a mile away. In a comment, [he noted](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/#comment-655 "Mason Porter")

*No Letterman-style ‘Top Ten’ list in the traditional reverse order? And one or two of the reasons should be snarky, of course. E.g. ‘Reason number 5: “Three words: Zachary Karate Club”’*

Beating me to the punch (and with a better joke). My original silly reason was

*Petter and I paid for everything using my mom's credit card, so we need your registration money or I'll be grounded for NetSci 2014.*

[But I kind of revealed that one in my answer to Mason](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/#comment-656 "Answer"), so I've had to come up with new material.

## Reason 7: Ok, the snarky/silly reasons.

Here's what I have so far - the jokes are pretty weak, so please forgive me. [Rules of NetSci](http://www.diggingforfire.net/FightClub/ "FightClub"):

- *First rule of NetSci: Submit an abstract to NetSci.*
- *Second rule of NetSci: SUBMIT an abstract to NetSci.*
- *Third rule of NetSci: If someone says "stop", goes limp, or taps out, the talk is over.*
- *Fourth rule of NetSci: Only approximately one person to a talk.*
- *Fifth rule of NetSci: One talk at a time (unless it's a parallel session).*
- *Sixth rule of NetSci: Wear a shirt, and shoes.*
- *Seventh rule of NetSci: Talks will stay strictly within time limits.*
- *Eighth rule of NetSci: If this is your first NetSci, you HAVE to talk.*

Another good reason to go is that on the first day of a NetSci conference, [Laszlo can't deny any request](http://refspace.com/quotes/The_Godfather "Godfather quote"). (*But some day, and that day may never come, he may call upon you to do a service for him. But until that day, consider his justice a gift on the first day of NetSci*).

[![Godfather](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_7/screen-shot-2013-03-13-at-6-51-43-pm_c34658a3.png)](/assets/20130313_netsci_2013_nine_reasons_to_go_reason_7/screen-shot-2013-03-13-at-6-51-43-pm.png)

Stay tuned for the remaining two (serious) reasons - and don’t forget to check out the other reasons to visit NetSci 2013.

- [Reason 1. The venue](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)")
- [Reason 2. The NetSci school](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)")
- [Reason 3. Brand new types of talks](http://sunelehmann.com/2013/03/07/netsci-2013-nine-reasons-to-go-reason-3/ "NetSci 2013: Nine reasons to go (Reason 3)")
- [Reason 4. Publisher session](http://sunelehmann.com/2013/03/08/netsci-2013-nine-reasons-to-go-reason-4/ "NetSci 2013: Nine reasons to go (Reason 4)")
- [Reason 5. The superb list of satellite workshops](http://sunelehmann.com/2013/03/11/netsci-2013-nine-reasons-to-go-reason-5/ "NetSci 2013: Nine reasons to go (Reason 5)")
- [Reason 6. The NetSci Dinner](http://sunelehmann.com/2013/03/12/netsci-2013-nine-reasons-to-go-reason-6/ "NetSci 2013: Nine reasons to go (Reason 6)")

NetSci 2013: Nine reasons to go (Reason 6)

Tue, 12 Mar 2013 13:32:14 GMT

In today's list of reasons that you should head over to the NetSci website [and submit an abstract right away](http://netsci2013.net/wordpress/call-for-papers/ "CFP") (**the deadline is this Friday, people!**), we change focus away from the academic reasons and turn our attention to the culinary reasons.

**Reason 6: The NetSci dinner**

Copenhagen is known for its food. The world's best restaurant, [Noma](http://noma.dk "Noma"), is located in Copenhagen, and we even have [Thai restaurants](http://www.kiin.dk/welcome.html) here with Michelin stars. This year, the NetSci dinner reflects that fact. We're having the dinner in University of Copenhagen's (not too shabby) Banquet Hall, pictured below,

[![festsal](/assets/20130312_netsci_2013_nine_reasons_to_go_reason_6/festsal_98a78460.jpg)](/assets/20130312_netsci_2013_nine_reasons_to_go_reason_6/festsal.jpg)

and the dinner will feature uncompromising French cuisine created by the renowned Chef Francis Cardenau, from [*Le Sommelier*](http://english.lesommelier.dk "Restaurant"). We're even in negotiation to have fine wines included in the dinner price. And don’t forget to check out the other reasons to visit NetSci 2013.

- [Reason 1. The venue](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)")
- [Reason 2. The NetSci school](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)")
- [Reason 3. Brand new types of talks](http://sunelehmann.com/2013/03/07/netsci-2013-nine-reasons-to-go-reason-3/ "NetSci 2013: Nine reasons to go (Reason 3)")
- [Reason 4. Publisher session](http://sunelehmann.com/2013/03/08/netsci-2013-nine-reasons-to-go-reason-4/ "NetSci 2013: Nine reasons to go (Reason 4)")
- [Reason 5. The superb list of satellite workshops](http://sunelehmann.com/2013/03/11/netsci-2013-nine-reasons-to-go-reason-5/ "NetSci 2013: Nine reasons to go (Reason 5)")

Stay tuned for more good reasons.

NetSci 2013: Nine reasons to go (Reason 5)

Mon, 11 Mar 2013 19:28:40 GMT

Now that the weekend is over, it's back to the reasons I think you should head straight over to the NetSci 2013 website and [submit an abstract](http://netsci2013.net/wordpress/call-for-papers/ "CFP"). The ***call ends on Friday***, so there's no time to waste. ## Reason 5: Amazing list of satellite workshops This year, the number of satellite workshops has increased significantly, and I sincerely think we're in for a treat. With many angles on [temporal & dynamic networks](https://twitter.com/JariSaramaki/status/304135547801174016 "Temporal"), plus many other exciting topics, this is the place to be, if you're interested in where the field of network science is headed. And the satellites *are right next to the NetSci school*, so you can switch back and forth, if you want to attend a satellite workshop, but you just can't miss that tutorial on NetworkX. [![laika](/assets/20130311_netsci_2013_nine_reasons_to_go_reason_5/laika_641d0720.jpg)](/assets/20130311_netsci_2013_nine_reasons_to_go_reason_5/laika.jpg) You can find full info on the satellites [here](http://netsci2013.net/wordpress/satellites/ "Satellites"), but I've also reproduced the (impressive) list below. **Monday (June 3, 2013)** - **Networks over Time** Organizers: Rickard Coster, Niloy Ganguly, Balaraman Ravindran, Subramanian Shivashankar, Samarth Swarup - [**Networks of Networks: Systemic Risk and Infrastructural Interdependencies**](http://netonets.org/netonets-2013/) Organizers: Antonio Scala, Gregorio D’Agostino, Guido Caldarelli, Stefano Battiston - **Human Behaviour and Network Science** Organizers: Janos Kertesz, Rosario N. Mantegna - **Language and Network Science** Organizers: Gareth Baxter, Alexander Mehler - **Network Science in Education** Organizers: Stephen Uzzo, Hiroki Sayama, Catherine Cramer, Peter Pollner, H. 
Eugene Stanley, Paul Trunfio - **Network Medicine: Future of Disease Genomics and Healthcare** Organizers: Amitabh Sharma, Yang-Yu Liu, David Gomez-Cabrero - [**Temporal and Dynamic Networks: From Data to Models – Day 1**](http://tdn2013.wix.com/tdn2013) Organizers: Vittoria Colizza, Chiara Poletto, Alain Barrat, Bruno Gonçalves, Luis Rocha, Renaud Lambiotte, Jari Saramaki, Esteban Moro - **State-Topology Coevolution in Adaptive Networks** Organizers: Hiroki Sayama, Thilo Gross - **Multiple Network Modeling, Analysis and Mining** Organizers: Guido Caldarelli, Michele Coscia, Przemysław Kazienko, Matteo Magnani, Dino Pedreschi, Luca Rossi - **Dynamic Networks: Theory and Applications in Information and Communication Networks** Organizers: Maarten van Steen, Yamir Moreno, Frank Schweitzer **Tuesday (June 4, 2013)** - [**Networks of Networks: Systemic Risk and Infrastructural Interdependencies (continued)**](http://netonets.org/netonets-2013/) Organizers: Antonio Scala, Gregorio D’Agostino, Guido Caldarelli, Stefano Battiston - **[Arts, Humanities, and Complex Networks](http://artshumanities.netsci2013.net/)** Organizers: Maximilian Schich, Isabel Meirelles, Roger Malina - **Network Models in Cellular Regulation** Organizer: Erzsébet Ravasz Regan - **Complex Networks meets Machine Learning** Organizers: Morten Mørup, Mikkel N. Schmidt, Tue Herlau, Lars Kai Hansen - [**Temporal and Dynamic Networks: From Data to Models – Day 2**](http://tdn2013.wix.com/tdn2013) Organizers: Vittoria Colizza, Chiara Poletto, Alain Barrat, Bruno Gonçalves, Luis Rocha, Renaud Lambiotte, Jari Saramaki, Esteban Moro - **Social Dynamics** Organizers: Yves-Alexandre de Montjoye, Coco Krumme, Chaoming Song, Dashun Wang And don't forget to check out the other reasons to visit NetSci 2013. - [Reason 1. The venue](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)") - [Reason 2. 
The NetSci school](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)") - [Reason 3. Brand new types of talks](http://sunelehmann.com/2013/03/07/netsci-2013-nine-reasons-to-go-reason-3/ "NetSci 2013: Nine reasons to go (Reason 3)") - [Reason 4. Publisher session](http://sunelehmann.com/2013/03/08/netsci-2013-nine-reasons-to-go-reason-4/ "NetSci 2013: Nine reasons to go (Reason 4)") Stay tuned for more reasons!

NetSci 2013: Nine reasons to go (Reason 4)

Fri, 08 Mar 2013 19:41:58 GMT

With only *one week left for submissions*, the list of reasons to visit NetSci continues. This time announcing an exciting event that's not even on the conference site yet. ## Reason 4: Publisher Session On Thursday night we're having a special publisher session. The idea is to give editors a chance to speak to authors about their journals - what they're looking for in terms of content & scientific writing. [![PSKroeyer](/assets/20130308_netsci_2013_nine_reasons_to_go_reason_4/pskroeyer_5d18ccdb.jpg)](/assets/20130308_netsci_2013_nine_reasons_to_go_reason_4/pskroeyer.jpg) The event will take place on the evening of June 6th and will feature a mix of talks and panel discussion. We have secured (another) great location, this time at the [Royal Danish Academy of Sciences and Letters](http://www.royalacademy.dk "Royal Academy"). Guests & speakers include: - Joerg Heber, representing *Nature Communications.* - Hernan Rozenfeld, representing APS (*Physical Review Letters* and *Physical Review E*). - Ernesto Estrada, representing the new *Journal of Complex Networks* from Oxford University Press. - Alessandro Vespignani, representing *EPJ Data Science*. - We also hope to have a representative for *Network Science* from Cambridge University Press. We're still planning this event, so we may have surprise speakers as well. Note that seating for the publisher session will be limited, so keep an eye on the NetSci page for up-to-date information on how to attend. Also don't forget to read about the first three reasons to visit NetSci 2013: - [Reason 1](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)"). - [Reason 2](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)"). - [Reason 3](http://sunelehmann.com/2013/03/07/netsci-2013-nine-reasons-to-go-reason-3/ "NetSci 2013: Nine reasons to go (Reason 3)"). Stay tuned for the rest of this series.

NetSci 2013: Nine reasons to go (Reason 3)

Thu, 07 Mar 2013 22:07:48 GMT

Wow. With all those superlatives in use for [reason 1](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)") and [reason 2](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)"), I'm beginning to worry that the excitement of those posts might have the whole set of "Reasons to go to NetSci" posts peaking too soon. But we have so many good reasons to go - *and only a few days left before the [call for papers](http://netsci2013.net/wordpress/call-for-papers/) closes (**last day is March 15th**)*, so there's no turning back now. (If you think Jony Ive was laying it on thick in all those Apple ads, wait till you've read all of these nine reasons.) ## Reason 3: Brand new types of contributed talks ***Ignite Session***: While the traditional parallel sessions are great, sometimes you just want a bigger audience ... to be heard by the entire NetSci audience. Well, this year we've added a new element, a session of [Ignite talks](http://igniteshow.com "Ignite"), which will allow many more of you to reach everyone at NetSci. The basic idea of the Ignite format is that presenters focus on the *central idea/result driving their research* using 20 slides that auto-advance every 15 seconds for a total of just five minutes. Getting through 20 slides in five minutes requires practice and creative thinking, but also results in quick, enlightening talks designed to draw out the essence of a topic. Finally, the Ignite event will be *open to the public*, so we expect a large audience with lots of visitors for Friday afternoon. ***Young researchers session***: In addition to the Erdős-Rényi Prize (read more about that [here](http://netsci2013.net/wordpress/erdos-renyi-prize/ "ER prize")), NetSci also has a prize for the best young (postdoc or graduate student) speaker. 
This year we'll pre-select six candidates for that prize who will present their work to the full NetSci audience in the main room, in a special 10 min format. We hope that this will allow the work of these promising young researchers to be seen by as many people as possible - and inspire the young researchers to give great talks. Don't forget to check out the other posts in this series - [Reason 1](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)") - [Reason 2](http://sunelehmann.com/2013/03/06/netsci-2013-nine-reasons-to-go-reason-2/ "NetSci 2013: Nine reasons to go (Reason 2)") And stay tuned for more good reasons.

NetSci 2013: Nine reasons to go (Reason 2)

Wed, 06 Mar 2013 11:03:41 GMT

Following up [on past promises](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)"), here's reason two that I think NetSci 2013 is going to be excellent. ## Reason 2: The NetSci School This year, the NetSci school is going to be something special. So even if you're not yet a *bona fide* network science insider who's wise to [inside jokes about Zachary's Karate Club](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/#comment-655 "Zachary"), NetSci 2013 is going to be a great conference to visit - a great place to learn about networks and network analysis. *Great school lecturers*: First of all, the big man - godfather of netsci - [László Barabási](http://en.wikipedia.org/wiki/Albert-László_Barabási "ALB") himself will teach at the school, taking his amazing (and free) [new textbook](http://barabasilab.neu.edu/networksciencebook/ "NetSci Book") about network science on the road. Also teaching at the school: monsieur [Renaud Lambiotte](http://www.lambiotte.be "Renaud") and sensei [Naoki Masuda](http://www.stat.t.u-tokyo.ac.jp/~masuda/ "Masuda") who’ll be addressing the sizzling hot topics of multiplexity and dynamics in complex networks. *Update your programming skills*: Secondly, we've included tutorials on some of the programming languages and visualization tools most important to network science. So even if you're a seasoned researcher, the school will have something for you. And to put it humbly, I think we've managed to get the best of the best: - [Aric Hagberg](http://math.lanl.gov/~hagberg/ "Hagberg") will talk about [NetworkX](http://networkx.github.com "NetworkX"). NetworkX is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. 
Besides being Deputy Director for the theoretical division of the Center for Nonlinear Studies at LANL, Aric is the primary developer of NetworkX, so we're in for a treat. - [Sébastien Heymann](http://www-complexnetworks.lip6.fr/~heymann/ "Heymann") will lead the tutorial on [Gephi](http://gephi.org "Gephi"), an open source interactive visualization and exploration software for complex networks. Sébastien is a leader of the [Gephi](http://gephi.org/) project and the administrator of the [Gephi Consortium](http://consortium.gephi.org/). He will include coverage of the newest version's ability to analyze and visualize evolving networks, so that one shouldn't be missed. - Finally, [Charles Pooh](http://www.linkedin.com/pub/charles-pooh/6/485/179 "Pooh") will address the new Network Science tools in [Wolfram Research](http://www.wolfram.com "Wolfram")'s [Mathematica](http://www.wolfram.com/mathematica/ "Mathematica"). Charles is a senior kernel developer at Wolfram and in charge of the [graph theory package](http://www.wolfram.com/mathematica/new-in-9/social-network-analysis/ "Mathematica Graph Theory") (among other things) in Mathematica. Mathematica's tools for networks have received a major upgrade in the past few years, so Charles' tutorial is sure to blow a mind or two. I'm a huge fan of everything above, so I couldn't be more excited for everyone to experience the school. ### Reason 2.1: The school venue We'll have the school at DTU, the [Technical University of Denmark](http://www.dtu.dk/English.aspx "DTU"), in the Oticon Hall, see below. 
[![DTU+Oticonsalen+4](/assets/20130306_netsci_2013_nine_reasons_to_go_reason_2/dtuoticonsalen4_dafe1c03.jpg)](/assets/20130306_netsci_2013_nine_reasons_to_go_reason_2/dtuoticonsalen4.jpg) As you might be able to make out, that's another great venue - and has the added benefit of being *right next to the satellite workshops.* The fact that everything is in one place means that you'll be able to move back and forth between the school and satellites - and we hope that this will make the first two days feel like a truly integrated part of the conference (with everyone being able to grab lunch in the same area, etc). So don't wait - go and [register](http://netsci2013.net/wordpress/registration/ "Register") right now. And don't forget to check out [part one of this series](http://sunelehmann.com/2013/03/05/netsci-2013-nine-reasons-to-go-reason-1/ "NetSci 2013: Nine reasons to go (Reason 1)").

NetSci 2013: Nine reasons to go (Reason 1)

Tue, 05 Mar 2013 09:30:45 GMT

Ok, so this June we're having the [NetSci conference](http://netsci2013.net/wordpress/ "NetSci2014") right here in Copenhagen. [Registration is now open](http://netsci2013.net/wordpress/registration/ "NetSci Registration") and the [call for papers closes on March 15th](http://netsci2013.net/wordpress/call-for-papers/ "NetSci CFP"). So in these last few days leading up to that deadline, I'm going to list some of what I think are the main reasons that you should dust off your Danish phrase books, submit your most exciting work, and head right over to your favorite online travel agent to book a ticket to go to NetSci this year! ## Reason 1: The venue This year's venue is simply spectacular. It's the kind of place that makes you feel like a better person just for showing up. The architecture of the Danish Royal Library (where we're having the conference) is breathtaking, and you'll be surrounded by (to mention a couple of random examples) original handwritten Kierkegaard manuscripts and Bohr's notes and journals. To provide you all with a sense of how amazing this space is going to be, I've included a few photos: ![](http://dl.dropbox.com/u/153071/black_diamond/Black%20diamond.jpg) ![](http://farm1.static.flickr.com/3/6506528_5516161b36.jpg) ![](http://www.photosight.org/up/2008/10/26/92235.jpg) And the interior is spectacular as well: ![](http://dl.dropbox.com/u/153071/black_diamond/2679547688_2edb3152e6.jpg) ![](http://dl.dropbox.com/u/153071/black_diamond/81798.jpg) ![](http://dl.dropbox.com/u/153071/black_diamond/3973906292_ed5fbbd9ac.jpg) And the venue is, of course, just the beginning - we have many more pleasant surprises planned. Stay tuned for Reason 2. 
### IMAGE CREDITS (IN ORDER OF APPEARANCE): - http://www.flickr.com/photos/snorri/158763944/ - http://www.flickr.com/photos/heilemann/6506528/ - http://www.photosight.org/photo.php?photoid=92235 - http://www.flickr.com/photos/16nine/2679547688/ - http://www.photosight.org/photo.php?photoid=81798 - http://www.flickr.com/photos/janlykke/3973906292/

Science and Art

Tue, 29 Jan 2013 10:32:15 GMT

Earlier this year, I was lucky enough to be awarded a Villum Foundation Young Investigator grant. A couple of days ago the Foundation's Annual Report came out [[dk link](http://veluxfondene.dk/C12576AB0041A865/0/F5D57E47E87E7D00C1257AF7001FF438/$file/VeluxAarsskrift2012-DK-WEB2.pdf), [en link](http://veluxfoundations.dk/C12576AB00426565/0/56E50ED58E1A9081C1257AF70021B07E/$file/VeluxAarsskrift2012-UK-WEB2.pdf). note: this is a 10mb download], with a very nice cover that features graphics designed by yours truly. [![villum_front_page](/assets/20130129_science_and_art/villum_front_page_0048705f.jpg)](/assets/20130129_science_and_art/villum_front_page.jpg) The cover art was created by the Danish artist [Smike Käszner](http://www.smike.dk) based on my graphics. Just for fun, I've included the original figures below. Here's the main network: [![net1](/assets/20130129_science_and_art/net1_2c01c031.png)](/assets/20130129_science_and_art/net1.png) As far as I can tell, the graphic that appears in Smike's artwork is a mirror image of the communication network at 20:00. I'm not 100 percent sure, but I think the background hairball was lifted from the figure below (even though I can't see the self-loop in the cover art anywhere in my network). [![net2](/assets/20130129_science_and_art/net2_07793a33.png)](/assets/20130129_science_and_art/net2.png) Inside the report (p 65), there's a nice feature on my project.

NetSci 2013 Call for Satellites

Wed, 10 Oct 2012 23:10:01 GMT

Slowly but surely [NetSci 2013](http://netsci2013.net/) is coming to life! We're now ready with the *official call for satellite workshops.* If you have an idea, don't hesitate to send it to Philipp - we're in the process of securing a great venue for the workshops, close to the NetSci school. Come to think of it, I think I might try to set up a workshop of my own! Keep an eye on the blog for more details. [![Image](/assets/20121010_netsci_2013_call_for_satellites/screen-shot-2012-10-10-at-11-05-48-pm_dc7c00b3.png)](/assets/20121010_netsci_2013_call_for_satellites/screen-shot-2012-10-10-at-11-05-48-pm.png)

Conrad Lee

Sun, 30 Sep 2012 11:58:13 GMT

Another great guest is visiting my group this week: [Conrad Lee](http://sociograph.blogspot.dk). Conrad has been writing consistently superb blog posts over at his blog for quite a while now (I highly recommend checking out his back catalog, which contains insightful analysis of issues related to community detection in complex networks and much more). And he's interested in many of the same topics that I've worked on for years, so there should be lots of great discussions. ![](/assets/20120930_conrad_lee/photo.jpg "Conrad Lee") Tomorrow, he'll be speaking on methods for validating community detection algorithms using meta-data in the talk: ***[Are network communities good for nothing? Benchmarking algorithms with inference tasks](http://www.imm.dtu.dk/English/Research/ISP/News.aspx?guid=%7B6CEDF265-6E56-4A1F-8BDD-F86FD1502D30%7D).*** The abstract hints at a very interesting talk (and contains a wildlife simile): > While community detection algorithms proliferate like rabbits in the spring, relatively little work has gone into determining which methods work best. In many cases, we know only that a given method can partition Zachary's Karate club - a problem which was solved over thirty years ago. Furthermore, the small literature concerned with benchmarking these algorithms focuses on synthetic data, leaving us with little evidence to support the claim that we can find meaningful communities in non-trivial, real-world social network data. We know so little about the performance of these algorithms because on the one hand we have a poor a priori intuition of how network communities are actually structured, and on the other hand we lack datasets that have a "ground truth" set of communities. > In this presentation, I argue that the quality of network communities can be evaluated by measuring how well they allow inference of missing information, such as certain node attributes and missing links. 
More concretely, good network communities should provide a machine learning model with informative features. I will discuss some conceptual and practical difficulties which came up when implementing a benchmark based on this premise using the Facebook100 dataset. Early results indicate that all tested methods have a bias for a particular scale, a finding which suggests that a scaling parameter is necessary. For example, modularity maximization and the Map Equation perform poorly, even when using the hierarchical versions of these methods. Their performance improved only when using their generalized formulations, which include a scaling parameter that alters the underlying objective function. I highly recommend stopping by if you're in the area! Time and place are listed [here](http://www.imm.dtu.dk/English/Research/ISP/News.aspx?guid=%7B6CEDF265-6E56-4A1F-8BDD-F86FD1502D30%7D).

Bruno Gonçalves

Sun, 23 Sep 2012 11:52:54 GMT

This week, we have another exciting guest: Bruno Gonçalves (twitter: @bgoncalves) will be visiting the lab on Monday the 24th of September and Tuesday the 25th. Bruno has just moved to Aix-Marseille University from Alex Vespignani's group at Northeastern, and we're excited to have him. ![](/assets/20120923_bruno_goncalves/Bruno2.jpg "Bruno Goncalves") Bruno is giving a talk *Monday at 11* - I highly recommend it: - **Title:** From Individual Activity to Collective Attention - Insights from Large Scale Social Network Analysis **Abstract:** Modern social systems such as Twitter expose digital traces of social discourse with an unprecedented degree of resolution of individual behaviors. They offer an opportunity to investigate both individual and collective behavioral patterns and to disentangle the temporal, spatial and topical aspects of human activity. A large survey of online exchanges or conversations on Twitter, collected across six months involving 1.7 million individuals is used to study how individuals manage their social relations. Two main features are observed: 1. Social interaction strength is highly dependent on the number of connections, corroborating Dunbar's Social Brain theory. A simple model shows how limited individual capacity for social interaction is enough to qualitatively reproduce the features observed. 2. Users display extremely diverse activity levels that follow a broad tailed distribution. We construct an activity driven model that is capable of encoding the instantaneous time description of social network dynamics. Within this framework, highly dynamical networks can be described analytically, providing a powerful tool for the analysis of social phenomena occurring over time-varying networks. 
Finally, we focus on Twitter activity surrounding American Idol voting as a minimal and simplified version of complex societal phenomena such as political elections, and show that the volume of information available in online systems permits the real time gathering of quantitative indicators anticipating the future unfolding of opinion formation events. **Time & place:** Monday 24 September at 11:00-12:00 Building 305, Seminar room 053

Cambridge week

Tue, 28 Aug 2012 11:26:41 GMT

This week, we'll have two exciting visitors. [Sebastian Ahnert](http://www.tcm.phy.cam.ac.uk/~sea31/) will be visiting on Wednesday, speaking on [*Compressible components reveal network architectures*](http://www.imm.dtu.dk/English/Research/ISP/News.aspx?guid=%7BB98B53F5-8721-4394-9E9E-ABC926D789CA%7D). ![](http://blogs.telegraph.co.uk/culture/files/2011/04/cambridge_universi_1355627c.jpg "Cambridge") On Thursday we'll change gears and discuss mobility, modelling, and sensing. For that we're lucky to have [Cecilia Mascolo](http://www.cl.cam.ac.uk/~cm542/) and her postdoc [Neal Lathia](http://www.cl.cam.ac.uk/~nkl25/) visiting, speaking on [*How can sensor data be used to infer how people feel?*](http://www.imm.dtu.dk/English/News/Calendar_IMM.aspx?guid=%7BBA547423-D5B5-473C-9578-CE7ECB72EF44%7D) If you're anywhere near Copenhagen, I highly recommend both talks!

Cover Story

Tue, 10 Jul 2012 22:06:13 GMT

So catching up on recent developments, a big item from last month is that I made the cover of the local university paper [DTU avisen](http://www.dtu.dk/Nyheder/DTU_Avisen.aspx) - with nice photos and everything. [![](/assets/20120710_cover_story/cover.png "cover")](http://sunelehmann.com/2012/07/10/cover-story/cover/) The best version of the article is only available in Danish [find it [here](http://www.dtu.dk/Nyheder/Nyt_fra_DTU.aspx?guid={780E83C9-69A4-4547-9B8D-60A5CD05F85C})]. Due to some last-minute editorial changes, a bunch of errors made their way into the printed version [find it here: [DK link](https://dl.dropbox.com/u/153071/dtu_avis_dk.pdf), [GB link (pdf)](https://dl.dropbox.com/u/153071/dtu_avis_gb.pdf), [GB link (html)](http://www.dtu.dk/English/About_DTU/News.aspx?guid=%7BCA3C2B32-3DEC-4116-B8C4-32DDB1DA7590%7D)].

Sune in Boston

Mon, 09 Jul 2012 22:44:40 GMT

Just a quick post to note that I'm in Boston and will be around for much of the summer (the rest of July and the first part of August). I arrived a bit over a week ago, but have been too busy to update the blog - I guess better late than never. In fact I have a substantial blog-backlog, so expect more activity on the blog over the next few weeks. [![](/assets/20120709_sune_in_boston/screen-shot-2012-07-09-at-4-45-46-pm.png "Boston")](http://sunelehmann.com/2012/07/09/sune-in-boston/screen-shot-2012-07-09-at-4-45-46-pm/) I'll be spending my time at the Center for Complex Network Research and the LazerLab. Do send me an email if you're around and would like to meet up!

NetSci 2012: looking ahead

Mon, 25 Jun 2012 21:50:41 GMT

Just back from a great [NetSci conference](http://netsci2012.net/) at Northwestern University. Aside from being an excellent meeting with plenty of interesting talks and opportunities to meet with old friends, the conference was a reminder that NetSci 2013 is approaching rapidly. ![](http://www.jeremycwilson.com/wp-content/uploads/2011/03/kelloggCongrats.jpg "Kellogg") At the conference, [Petter Holme](http://www.tp.umu.se/~holme/) and I managed to recruit a great set of new committee members to help organize next year's conference. In addition to Petter and myself, the organizing team now counts - [Katharina Zweig](http://www.ninasnet.de/) as the technical chair, responsible for submissions. - [Jan O. Haerter](http://cmol.nbi.dk/people.php?id=48) is the chair of registration. - [Isabel Meirelles](http://isabelmeirelles.com/) has graciously promised to do a redesign of the NetSci visual identity. - [Philipp Hövel](http://www.nncn.uni-freiburg.de/nachrichten-en/philipphoevel/view?set_language=en) will be the chair of satellites (so let him know if you're interested in running a satellite event). - [Bruno Gonçalves](http://www.bgoncalves.com/) is going to be the web chair, responsible for the content of the netsci website. - In Copenhagen [Joachim Mathiesen](http://www.nbi.dk/~mathies/) is organizing all things local. - And finally, [Alan Mislove](http://www.ccs.neu.edu/home/amislove/) runs the NetSci web-infrastructure with a steady hand. With this amazing team, help from the *Network Science Society,* and a superb venue (see below), I'm sure that NetSci 2013 is well under way. But we could always use more help, so let me know if you have any ideas for how *you* can help make NetSci 2013 the best ever. We're also always looking for volunteers to help us run the conference itself. ![](http://universitetsavisen.dk/files/universitetsavisen.dk/imagecache/620x380/pictures/diamanten.JPG "Venue")

Yong-Yeol Ahn

Tue, 22 May 2012 11:04:05 GMT

This week, my [collaborator](http://www.nature.com/nature/journal/v466/n7307/full/nature09182.html)/friend from [Barabasilab](http://www.barabasilab.com/)/[award winning physicist](http://www.akpa.org/oyra-award.html), [Yong-Yeol Ahn](http://www.yongyeol.com/wp/about/) (better known as [YY](https://twitter.com/#!/yy)) is visiting the [Center for Social Data](https://socialdata.imm.dtu.dk/) here at DTU. If you're anywhere near Copenhagen, I highly recommend you stop by to see his talk! **Time**: Thursday, May 24th, 13:00 [[details here](http://www.imm.dtu.dk/English/Research/ISP/News/yyanh_talk.aspx)] **Title**: Community structure and flocking of memes in social networks **Abstract**: Spreading processes on networks (e.g. epidemic outbreak and information spreading) have been one of the most fundamental topics in network science. Information spreading in social networks has often been described by epidemic spreading models but recent studies demonstrated that some contagions (memes) exhibit a fundamentally different pattern, where multiple exposure significantly enhances the transmission probability of the contagion. The co-operativity of a contagion makes the spreading process more sensitive to clustered network structure. Here we investigate the relationship between network communities and spreading of hashtags in a Twitter network. ![](/assets/20120522_yong_yeol_ahn/yy_2011.jpg "YY") And if we're lucky, maybe we can even talk him into speaking a bit about his interesting work on food pairings and molecular gastronomy!

MAPCON12

Tue, 15 May 2012 14:25:30 GMT

This week I'm visiting the [Max Planck Institute for the Physics of Complex Systems](http://www.mpipks-dresden.mpg.de/ "mpi pks") in Dresden, participating in the workshop *[Mathematical Physics of Complex Networks: From Graph Theory to Biological Physics](http://www.pks.mpg.de/~mapcon12/).* [![](/assets/20120515_mapcon12/dresden.jpeg "dresden")](http://sunelehmann.com/2012/05/15/mapcon12/dresden/) It's quite the honor to be invited to speak at a conference full of [real physicists & bona fide graph theorists](http://www.pks.mpg.de/~mapcon12/Teilnehmerliste_Web_100512.pdf) (although it's putting my softening brain hard at work: there seem to be a lot more analytical results here than I'm used to these days).

CompleNet 2012

Tue, 06 Mar 2012 07:39:45 GMT

This week I'm heading out to [CompleNet 2012](http://2012.complenet.org/) in [Melbourne, Florida](http://maps.google.com/maps?q=melbourne+florida&um=1&ie=UTF-8&hq=&hnear=0x88de0e2c4771994d:0x8bcdb254a90cd2a8,Melbourne,+FL&ei=xrBVT9ydB8zMtAbdzISCBw&sa=X&oi=geocode_result&ct=title&resnum=3&ved=0CFsQ8gEwAg). It looks like a great conference with [Laszlo Barabasi](http://www.barabasi.com/), Robert Bonneau & [Sinan Aral](http://web.mit.edu/sinana/www/) headlining. The conference will also feature an unreasonably high level of activity by yours truly, with duties including an invited talk (8:40 on March 8th), chairing a technical session (on network metrics and models, 10:20 on March 7th), as well as a brief talk at the opening of the art exhibition on *The Art of Networks* at the local [Foosaner Art Museum](http://foosanerartmuseum.org/), about the creation of the [TwitterMood](http://www.ccs.neu.edu/home/amislove/twittermood/) visualization. ![](http://www.golfpacspacecoast.com/graphics/lodging/CB-CRP-01-360x260.JPG "Crown Plaza") Hope to see you there if you're in or around the Sunshine State!

Renaud Lambiotte

Thu, 01 Mar 2012 08:40:55 GMT

[Renaud Lambiotte](http://www.lambiotte.be/) is visiting for a few days. He's an exciting guy whose work focuses on the relation between dynamics, function and structure in complex systems, with a focus on neuronal and social networks (check [his website](http://www.lambiotte.be/) for more details). He's an associate professor in Mathematics at the University of Namur (Belgium). If you're in the Copenhagen area, I highly recommend going to [his talk](http://www.imm.dtu.dk/English/Research/ISP/News/renaud_talk_02032012.aspx) this Friday. Here are the details. **Title: Random Walks on Networks: Dynamics and Teleportation** **Abstract:** In this talk, I will focus on two problems related to random walks on networks. First, I will focus on random teleportation, which is a necessary evil for ranking and clustering directed networks based on random walks. Teleportation enables ergodic solutions, but the solutions must necessarily depend on the exact implementation and parametrization of the teleportation. For example, in the commonly used PageRank algorithm, the teleportation rate must trade off a heavily biased solution with a uniform solution. Here we show that teleportation to links rather than nodes enables a much smoother trade-off and effectively more robust results, and discuss the effect of teleportation on clustering. In the second part of my talk, I will focus on random walks on temporal networks, i.e. networks evolving in time. In particular, I will examine the effects of inter-event statistics on the dynamics of edges, and apply the concept of a generalized master equation to the study of continuous-time random walks on networks. **Time & place:** - Friday 2 March 2012 at 14:00 - Technical University of Denmark - Seminar room 053, Building 305 **Everyone is welcome!**
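For the curious, the PageRank trade-off mentioned in the abstract can be seen on a toy example. What follows is only a minimal sketch, assuming the textbook version of PageRank with uniform teleportation to nodes (not the link-teleportation variant that the talk proposes). The function name `pagerank`, the parameter `alpha`, and the four-node network are made up for illustration.

```python
def pagerank(out_links, alpha, iterations=200):
    """Power iteration for PageRank with uniform teleportation to nodes.

    out_links: dict mapping each node to the list of nodes it links to.
    alpha: probability of following a link; 1 - alpha is the teleportation rate.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the same teleportation share.
        new_rank = {v: (1.0 - alpha) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = alpha * rank[v] / len(targets)
                for w in targets:
                    new_rank[w] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = alpha * rank[v] / n
                for w in nodes:
                    new_rank[w] += share
        rank = new_rank
    return rank


# Hypothetical toy directed network: a -> b -> c -> a, plus d -> a.
toy = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}

for alpha in (0.0, 0.5, 0.85):
    ranks = pagerank(toy, alpha)
    print(alpha, {v: round(r, 3) for v, r in ranks.items()})
```

With `alpha = 0` the walker teleports at every step and all nodes get the same score (the uniform solution). As `alpha` grows, the link structure dominates more and more, so the ranking depends on how the teleportation is parametrized.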

Not a bad couple of months

Wed, 08 Feb 2012 11:42:43 GMT

Three pieces of excellent news since I last posted here: On December 13th, I received The Jorck Foundation's research prize, which is awarded to three young researchers annually (as is evident from the photo, "young-ish researchers" might be more appropriate). ![](http://dl.dropbox.com/u/153071/jorcks_fond-2011_lille.jpg "Jorck's ") I've stolen the image from the [DTU announcement](http://www.dtu.dk/Nyheder/Nyt_fra_DTU.aspx?guid={F01668CE-03AC-483E-A726-19509D2A9562}) (in Danish), which lists all names, etc. The gentleman on my right hand side is the Danish [supreme court president](http://en.wikipedia.org/wiki/Supreme_Court_of_Denmark), Børge Dahl, who is also the director of the Jorck Foundation's board. Justice Dahl awarded the prize in person inside the actual supreme court, at [Christiansborg Palace](http://en.wikipedia.org/wiki/Christiansborg_Palace) - the proceedings were very old-worldly (is that a word?) and fancy. Then, on January 23rd, I was officially awarded millions in a very nice research grant as a part of the [Villum Foundation](http://veluxfoundations.dk/)'s *[Young Investigator Program](http://villumfoundation.dk/Fonden/UK/Vilfon/Content.nsf/f88c9976dcb9a800c1256c67004b56ab/fa3c4a44a1b36a83c125797b00466c5c?OpenDocument)*. More information on the content of the proposal can be found [here](http://www.imm.dtu.dk/Nyheder/Nyheder_IMM.aspx?guid=%7B2318BB7A-777E-4CAB-A1EA-1BEBF61EDB69%7D) (official DTU page). [![](/assets/20120208_not_a_bad_couple_of_months/yip.jpg "yip")](http://sunelehmann.com/2012/02/08/good-month/yip/) In fact, I'm currently looking for a new PhD student (fully funded) to work on this project; should you be interested, you can read the details [here](http://www.dtu.dk/Om_DTU/ledige_stillinger.aspx?guid=dbad8321-1804-41d0-99d0-534c31dc3de1). 
Finally, effective February 1st, I am an [Associate Professor](http://www.imm.dtu.dk/English/Service/Phonebook.aspx?lg=showcommon&type=person&id=25317) of social informatics, still at DTU Informatics -- and this might be the very best piece of news in a great couple of months.

CCCSS Workshop

Sun, 04 Dec 2011 21:06:12 GMT

The program for our workshop on December 15th is finally available for your reading pleasure. We'll be talking about how to design a great sociometer experiment - and what the most exciting research questions are. *Note that the workshop is **open to the public**, so if you're in (or near) Copenhagen, do stop by!* We do have limited seating, so please send an email to David ([ddl@econ.ku.dk](mailto:ddl@econ.ku.dk)) by December 12th, if you plan on attending. # Workshop: The Copenhagen Center for Computational Social Science Inaugural Workshop. ***December 15th, 2011.*** Organized by [Anders Blok](http://www.soc.ku.dk/english/staff/researchers/?id=182002&f=3&vis=medarbejder), [Søren Kyllingsbæk](http://cvc.psy.ku.dk/sk/), [David Dreyer Lassen](http://www.econ.ku.dk/Faculty_And_Staff/showID.asp?profile_id=817), [Morten Axel Pedersen](http://anthropology.ku.dk/staff/beskrivelse/?id=255694&f=3) and yours truly. ## Abstract: At last year's Techonomy Conference, former Google CEO, Eric Schmidt, noted: "There was 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days, and the pace is increasing". This massive increase in the rate of data generation has opened up new possibilities for computational investigations of human behavior. We - a multi-disciplinary team of scholars from the Faculty of Social Sciences at the University of Copenhagen and the Technical University of Denmark - are interested in taking advantage of the recent technological developments in order to push the current boundaries of quantitatively based understandings of social systems. 
Specifically, the aim of our proposed research program CCCSS (Copenhagen Center for Computational Social Science) is to record the network of social interactions with very high resolution (both in terms of temporal sampling and number of recorded communication channels) by using smart phones as sensors for sampling a variety of communication channels, e.g. face-to-face via Bluetooth, geolocation via GPS, social network data (Facebook, Twitter) via apps, and telecommunication data via call logs. Based on this highly complex and dynamic network, we want to develop computational (mathematical) approaches to describe the underlying social system. In addition to this overall goal, we are interested in five concrete themes, which will support and inform our efforts to formulate a general theoretical framework spanning across different scientific disciplines: 1. *Incomplete data and sampling.* The significance of having access to only a small fraction of the full data in a networked system is poorly understood at present. We will use our findings from this high-resolution sample as a tool to understand much larger 'low resolution' data sets describing millions of individuals and billions of interactions. 2. *Information stored in relationships.* We know, in a casual sense, that it is possible to learn about a person by the company she keeps. We show that we can quantify this notion in a social network and we study to what extent our behavioral patterns are encoded in our social relations. 3. *Influence in social systems.* We wish to study how influence spreads in social systems, which is a problematic issue in most datasets. Our experimental setup allows us to probe causal issues by running controlled interventions; we will be able to run field experiments to test our hypotheses. 4. *Methodological experiments and their epistemological effects.* For a long time, social scientific methods have been split according to a qualitative/quantitative divide. 
Based on our experiment, we want to explore how new high-resolution datasets may shift the terms of this debate. As part of this effort, we also wish to investigate what the increasing use of digital setups in social network analysis means for the nature of the (social) scientific experiment. 5. *Privacy and ethics in social network research.* We explore the question of privacy and develop novel strategies to ensure that our research (and the research of others working on similar topics) does not violate individual and collective rights to privacy. ## Program: Thursday, December 15th, 2011: - 9.30 Coffee - 10.00 Sune Lehmann: Introduction - 11.00 [Martin Raubal](http://raubal.cartography.ch/): Socially informed location-based knowledge discovery - 12.00 Lunch - 13.00 [Daniele Quercia](http://www.cl.cam.ac.uk/~dq209/): Personality and Language in Social Media - 14.00 Tea - 14.30 [Alan Mislove](http://www.ccs.neu.edu/home/amislove/): Privacy in Online Social Networks - 15.30 [Matt Candea](http://www.candea.net/Home.html): The quantity and quality of gaps: On the value of not knowing certain things Note that we'll follow the format 30 min. + discussion for all talks. ## Venue: The Seminar Room (2nd floor, CSS 26.2.21) Department of Economics Building 26, Centre of Health and Society (CSS) Øster Farimagsgade 5, 1353 København, Denmark

Google's generosity goes to zero!

Fri, 07 Oct 2011 08:51:17 GMT

According to the best of my calculations, the growth of Gmail storage is linear in time. Today, I recorded the amount of storage at two different times and found the rate of storage growth to be about 4.06541 bytes per second. This is consistent with [Wikipedia's report](http://en.wikipedia.org/wiki/Gmail#cite_ref-17) that, as of Jan 18th, 2010, Gmail's storage was increasing at a rate of approximately 0.000004 MB per second. In other words, Google is giving away space at a constant rate. Now, since the price of hard drive storage space seems to drop exponentially ([over the last 30 years, space per unit cost has doubled roughly every 14 months (increasing by an order of magnitude every 48 months)](http://www.mkomo.com/cost-per-gigabyte "Hard drive prices")), this implies that Google is paying exponentially less for their new hard drive space [1]. The only reasonable conclusion is that ***Google's generosity is rapidly approaching zero!*** Just to be extra silly, I actually plugged in the growth data from my own account and used the regression fit from the site above in order to estimate the cost per Gmail account as a function of time. [![](/assets/20111007_google_s_generosity_goes_to_zero/account_cost.png "account_cost")](http://sunelehmann.com/2011/10/07/googles-generosity-over-time/account_cost/) Full disclosure: There are a number of problems with the approach of estimating the cost of an account as *current storage* multiplied by *current cost of storage.* And let me just mention some of them here for transparency. Firstly, my storage price is based on consumer hardware prices, and I'm betting that Google can probably get some kind of bulk deal. Secondly, I assume that Google has some kind of backup system in place, which increases the need for storage beyond the account size reported by Google. 
Finally and most importantly, the correct price for storage over time should probably be estimated as accumulated price paid for hardware at time *t* compared with the total amount of storage offered for free at time *t*. And there's one final problem with the linear growth of storage. The issue becomes extra noticeable because all this cheap storage also applies to our personal computers ... and to the average attachment size, which is probably growing in proportion to the size of the hard drive it was sent from [2]. What this means is that we're likely to use up Gmail storage space at a rapidly increasing rate. I'm not saying that this is a violation of the "don't be evil" maxim. It's just that I'm running out of inbox space and don't want to pay for additional storage. ## Footnotes [1] See also http://ns1758.ca/winch/winchest.html for more info on historical hard drive pricing. [2] I don't really have data to support this claim, but it sounds reasonable to me.
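
For anyone who wants to play with the numbers in this post, here is a minimal Python sketch of the arithmetic. It only uses the two figures quoted above (4.06541 bytes per second and a 14-month doubling time for space per unit cost). The function names, the 365-day year, and the choice of 1 MB = 10^6 bytes are assumptions made for illustration; this is not the regression fit behind the plot, just the simple "constant amount, exponentially falling price" picture.

```python
import math

# Figures quoted in the post.
GROWTH_BYTES_PER_SECOND = 4.06541   # measured growth of the Gmail quota
DOUBLING_TIME_MONTHS = 14.0         # space per unit cost doubles roughly every 14 months

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year


def growth_mb_per_second(bytes_per_second=GROWTH_BYTES_PER_SECOND):
    """Convert the growth rate to MB/s (assumption: 1 MB = 10**6 bytes)."""
    return bytes_per_second / 1e6


def quota_added_bytes(seconds, bytes_per_second=GROWTH_BYTES_PER_SECOND):
    """Linear growth: bytes added to the quota after `seconds`."""
    return bytes_per_second * seconds


def relative_price_per_byte(months, doubling_time=DOUBLING_TIME_MONTHS):
    """Exponential decay: price of one byte relative to its price at month 0."""
    return 0.5 ** (months / doubling_time)


def months_per_order_of_magnitude(doubling_time=DOUBLING_TIME_MONTHS):
    """Months needed for space per unit cost to grow tenfold."""
    return doubling_time * math.log2(10)


def relative_cost_of_yearly_gift(year):
    """Cost of the storage added during `year`, relative to year 0.

    The amount added per year is constant, so only the price changes;
    the price is sampled at the start of the year for simplicity.
    """
    return relative_price_per_byte(12 * year)


if __name__ == "__main__":
    print(f"growth: {growth_mb_per_second():.6f} MB/s")
    print(f"added per year: {quota_added_bytes(SECONDS_PER_YEAR) / 1e6:.0f} MB")
    print(f"tenfold every {months_per_order_of_magnitude():.1f} months")
    for year in range(6):
        print(year, f"{relative_cost_of_yearly_gift(year):.3f}")
```

With a 14-month doubling time, the tenfold time comes out a bit under 47 months, which is in the same ballpark as the "48 months" quoted above. The relative cost of each year's constant gift shrinks geometrically, which is all the "generosity goes to zero" claim amounts to.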

Feynman on Beauty ... again

Thu, 06 Oct 2011 15:14:33 GMT

A while ago I [wrote about beauty in Nature](http://sunelehmann.com/2011/03/20/whitman/) (seen from the perspective of a physicist), and posted a video with Richard Feynman's insightful reflections on the topic. I just came across another version of that answer, which is incredibly beautiful. Check it out below:

Note. I found the video on [kottke.org](http://kottke.org/11/10/richard-feynman-on-beauty).

On 'Frictionless Sharing'

Mon, 03 Oct 2011 22:44:02 GMT

If you like sharing everything and if you think that pressing a 'like'-button is too much work, you're going to love Facebook's new frictionless sharing. If you like to steal a private moment once in a while and sometimes try to pretend to be cooler than you are, you might not like it so much. Ok, first, let's recap the basic idea behind the frictionless sharing: If one of Facebook's '[social plugins](http://www.facebook.com/help?page=1068)' is installed on a site you're visiting you're *automatically sending anything you read into your Facebook news feed*. And the only clicking you'll have to do is the actual clicking through to the article. Oh, and the final piece of good news is [that you don't even have to be logged into Facebook](http://nikcub.appspot.com/logging-out-of-facebook-is-not-enough) for the social plugins to work [*update*, seems like that issue [has actually been fixed](http://nikcub.appspot.com/facebook-fixes-logout-issue-explains-cookies)]. One reason I think this might end badly is that an essential part of the Facebook experience is the pleasure of carefully creating a gently improved online/external version of who you are. I'm not sure people are going to like it when that aspect is slowly eroded away. My favorite example for when this external persona comes into being is when you explain to people what kind of music you like. It's nearly impossible (for me at least) not to bring up the coolest music that you listen to, rather than the music you *like* the most. For example, you might mention Animal Collective's *Merriweather Post Pavilion* as your favorite album [1], rather than point out that a mix including Bangles' *[Eternal Flame](http://www.youtube.com/watch?v=y_iISmExEGo&feature=related)*, Christina Aguilera's *[Beautiful](http://www.youtube.com/watch?v=eAfyFTzZDMM)* and Bon Jovi's *[Bad Medicine](http://www.youtube.com/watch?v=eOUtsybozjg)* has a power-law-tail-worthy play count in your iTunes player. 
Not being allowed to construct an idealized version of yourself is a bit like being forced to always wear t-shirt, jeans, and flip flops. However, sharing everything has other downsides, the most important of which is that 'oversharing' might rob us of the ability to steal a moment once in a while. Let me try and explain why that might be a problem: A recent post on kottke.org has the title "[Why is Sergey Brin so good at Angry Birds](http://kottke.org/11/09/why-is-sergey-brin-so-good-at-angry-birds)". Kottke writes: I spent perhaps too much time this morning pondering one of the mysteries of the internet: Sergey Brin's astronomically high scores on the Google+ version of Angry Birds. For instance, [Brin's high score on the easiest level of the game is 36240](https://plus.google.com/109813896768294978296/posts/6ScjwbLJLhF). It's a legit score ([here's a higher one](https://plus.google.com/107675566661349240635/posts/ivobZtHH9x5)) and he has impressive scores on several other levels. It's a neat observation [2], but the crucial point of the story is that everyone is left to wonder: 'Why is Brin spending his time playing Angry Birds, when he should be at work running Google?' And most of us aren't even shareholders. Or the other day, on my way home from work, I noticed that the fall sunlight was particularly golden -- and on a whim, I took a small detour to enjoy a couple of additional minutes outside before returning home to help with tired/moody toddler care (including diaper changes) and other post-work chores. Without frictionless sharing, I can still get away with stuff like that, but I'm wondering what my wife would have thought if Facebook had posted something like 'Sune took a detour in the sun today', while she was at home working hard to rein in a tired 1.3 year old. 
Now, I could (and would) certainly argue that stealing a moment was a good idea - that a couple of minutes of unplanned meandering once in a while is what keeps me (and, I think, other people ... for example Sergey Brin) sane in an increasingly busy world[.](http://www.youtube.com/watch?v=1e3mLmFScB8) And I'm also pretty sure that I could have convinced my wife that that detour was not a waste of time. The problem is that *having to explain* that moment would have kind of ruined it. So if I had known that my stolen moment had been actively shared by Facebook, I probably would have gone straight home. And that's the problem: It's not that you can't still steal a moment with frictionless sharing. It is the fact that you might have to justify each one that might ruin those moments; perhaps even make you decide not to steal any more moments. And that seems to me like something almost worse than a simple invasion of privacy. Let me know what you think in the comments! ### Notes [1] Ok, so that's probably not a hip album anymore, but I'm much too busy to be hip these days. [2] Also note that Kottke is *making excuses for stealing a moment* to ponder silly stuff like Sergey Brin's Angry Birds score.

More on TweetQuakes

Sun, 04 Sep 2011 16:17:07 GMT

A few days ago, I wrote (with Alan Mislove) about our TweetQuake visualization (read the relevant post [here](http://www.iq.harvard.edu/blog/netgov/2011/08/tweetquake.html)). Some of the commenters pointed out that it's not really surprising that tweets travel faster than earthquakes. Here's Andrew Gelman (I don't know if it's [that famous Andrew Gelman](http://andrewgelman.com/), but I think so) commenting on [The Monkey Cage Blog](http://themonkeycage.org/blog/2011/08/25/twitter-is-faster-than-an-earthquake/): [![](/assets/20110904_more_on_tweetquakes/screen-shot-2011-09-01-at-11-05-37-pm.png "Screen Shot 2011-09-01 at 11.05.37 PM")](http://sunelehmann.com/2011/09/04/more-on-tweetquakes/screen-shot-2011-09-01-at-11-05-37-pm/) And he's right. Information traveling via optical fiber is about as fast as anything you can find in the universe (and as Gelman points out, other important examples of rapid communication technology include telephone/radio communication). This much was even clear to yours truly when I read the xkcd comic no. 723 back in April of 2010. I [tweeted](http://twitter.com/#!/suneman/status/11641266636): i guess it's somewhat trivial, but nonetheless - it seemed profund when i read it: tweets are faster than earthquakes So why did it seem profound when I read the comic? Why is it still interesting that Twitter is faster than an earthquake? The fact that the news of the earthquake on twitter spreads faster geographically than the earthquake itself ***is*** something non-trivial and profound. And I think I can explain why. Until now, we've categorized earthquakes among events that happen so quickly that they're *instantaneous* for all intents and purposes. An event that propagates at between 6,700 and 11,200 miles/hour is incredibly fast. So the surprise is not that electronic signals are fast, *but that a news medium (i.e. Twitter/Facebook) can deliver news faster than things that used to be instantaneous*. That is what is new (and kind of awesome)! 
But not that awesome - because even though you know the earthquake is coming before it hits, there's still not really time to react properly to the threat; the earthquake will still be there in a few seconds' time. And the Twitter advertisement team picked up on just this fact in their most recent advertisement, embedded below.

The message is clear: You do get the news about the quake arriving, but it doesn't really change anything. But let's dig a little deeper. Last year, when we created the twitter *Pulse of the Nation* visualization (check it out [here](http://www.ccs.neu.edu/home/amislove/twittermood/) if you haven't seen it), I came up with a highly speculative (and self-important) analogy that I love to talk about. The general idea is that even though the importance of individual tweets is highly variable, something interesting begins to happen when we look at thousands, millions, or even billions of them. I wrote: In analogy to individual neurons firing together to add up to the human consciousness, the billions of tweets have meaningful macro-states that contain information about the whole system rather than the individual tweeters. But we need to do a little data mining to extract meaningful information about these states, to expose our collective states of mind. [quoted from [here](http://www.iq.harvard.edu/blog/netgov/2010/07/mood_twitter_and_the_new_shape.html)] Now, I think the earthquake visualization can be thought of as a manifestation of the same kind of phenomenon. If the twitterverse is to be taken seriously as some kind of global-scale nervous system, the earthquake response is not something like the state-of-mind or consciousness that I claimed the mood was. The earthquake response is something closer to that ultra fast reflex that kicks in right before you're unavoidably punched in the face. Like the guy in the movie below at around 16 seconds in. Notice him closing his eyes and clenching his facial muscles tightly in anticipation:

He knows something uncomfortable is coming, but has to hang tight and hope that it's not too tough. And that's the type of edge that twitter has given us with respect to the earthquake. Let me know what you think in the comments!

TweetQuake

Thu, 25 Aug 2011 02:12:32 GMT

*This is a joint post with [Alan Mislove](http://www.ccs.neu.edu/home/amislove/), based on our work with [Yong-Yeol Ahn](http://yongyeol.com/) and Chloe Kliman-Silver.* On August 23, 2011, at 1:51 PM EDT a magnitude 5.8 earthquake hit the Piedmont region of the U.S. state of Virginia. Orders of magnitude smaller than the recent earthquake in Japan, this quake was nonetheless the largest in the U.S. east of the Rocky Mountains in 114 years (according to [Wikipedia](http://en.wikipedia.org/wiki/2011_Virginia_earthquake)). But why are we talking about earthquakes? We should be talking about people talking about earthquakes. And people really did some talking. The official twitter account (@twitter) posted three back-to-back tweets on the subject: > Are Tweets faster than seismic waves? We can't speak to speed of seismic waves, but a Tweet can reach your followers in less than a second. [[link](https://twitter.com/#!/twitter/status/106117804540956672)] > Within a minute of today's #earthquake, there were more than 40,000 earthquake-related Tweets. [[link](https://twitter.com/#!/twitter/status/106145327760936960)] > And, we hit about 5,500 Tweets per second (TPS). For context, this TPS is more than Osama Bin Laden's death & on par w/ the Japanese quake. [[link](https://twitter.com/#!/twitter/status/106145363559321600)] Now, as I am sure many people have already pointed out (e.g. [on twitter](http://twitter.com/#!/search/xkcd.com%2F723%2F)), this situation was deftly analyzed and anticipated by Randall Munroe, author of the wonderful webcomic ***xkcd*** back in April 2010. Here's the strip: ![seismic_waves](/assets/20110825_tweetquake/seismic_waves.png) As Munroe points out, the speed of "damaging" seismic waves is around 3-5 km/second, which is much slower than the speed of information spreading on the internet. This simple fact means that if you're more than 100 km away from the epicenter you can read about the quake on twitter before it hits you. 
Now, combine the idea from the ***xkcd*** strip with data from the tweetquake and it's *possible to observe this phenomenon in practice*. In the visualization below, we've generated a video of the mentions of the word "earthquake" in tweets from the gardenhose in the 5 minutes immediately following the earthquake. For simplicity, we have assumed a uniform 4 km/s wave and ignored deformations due to map projections, etc. (we're not geologists, after all).
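
The wavefront logic behind such a visualization can be sketched in a few lines of Python. This is a minimal illustration under the same simplification as above (a uniform 4 km/s wave spreading in a circle from the epicenter); it is not the code used to produce the video, and the function names and the example distances and delays are hypothetical.

```python
WAVE_SPEED_KM_S = 4.0  # uniform wave speed assumed in the post


def wavefront_radius_km(seconds_since_quake):
    """Distance the wave has travelled from the epicenter after a given time."""
    return WAVE_SPEED_KM_S * max(0.0, seconds_since_quake)


def posted_outside_wavefront(distance_km, tweet_delay_s):
    """True if a tweet was posted from a place the wave had not yet reached.

    distance_km   -- distance of the tweeter from the epicenter (assumed known)
    tweet_delay_s -- seconds between the quake and the tweet's timestamp
    """
    return distance_km > wavefront_radius_km(tweet_delay_s)


def warning_time_s(distance_km, news_delay_s):
    """Seconds of advance notice for a reader at `distance_km` who sees the
    news `news_delay_s` seconds after the quake (negative means too late)."""
    return distance_km / WAVE_SPEED_KM_S - news_delay_s


if __name__ == "__main__":
    # Hypothetical example: a tweet written 30 s after the quake.
    for distance in (100, 150, 400):
        print(distance, posted_outside_wavefront(distance, 30))
```

Under these assumptions a tweet that takes 30 seconds to write can only appear outside the wavefront if it comes from more than 120 km away, which is why typing time matters so much here.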

The comic strip doesn't factor in the time it takes to actually write a tweet, and since seconds count, it takes more than 100 km before we see tweets posted outside the wavefront (validating the last frame of the comic strip). It is awe inspiring to see a *truly* real time news medium in action. **Notes**: - The video looks a lot better in high definition on [YouTube](http://www.youtube.com/XJ1EQbmJ_LQ). - See also [this tweet](http://twitter.com/#!/suneman/status/11641266636) from April 2010. - This content was first posted at the Complexity and [Social Networks Blog](http://www.iq.harvard.edu/blog/netgov/2011/08/tweetquake.html "Complexity and Social Networks").

Link communities R package

Fri, 24 Jun 2011 12:48:28 GMT

A while ago, I [wrote](http://sunelehmann.com/2010/11/03/visualizing-link-communities/) about Rob Spencer over at [Scaled Innovation](http://scaledinnovation.com/)'s implementation of the algorithm for detecting link communities. Today, I am happy to report on another exciting development for the algorithm. Alex Kalinka from the [Tomancak lab](http://www.mpi-cbg.de/research/research-groups/pavel-tomancak.html "mpi") at the Max Planck Institute (MPI-CBG) has written a great implementation in *R*, called *linkcomm*. It is now up on CRAN: While everything is excellent, the graphics are particularly beautiful - much prettier than our own visualizations - check out the colored link dendrogram plot (from the [CRAN website](http://cran.r-project.org/web/packages/linkcomm/vignettes/linkcomm.pdf "image source")) [![](/assets/20110624_link_communities_r_package/screen-shot-2011-06-22-at-8-24-53-pm.png "Link dendrogram")](http://sunelehmann.com/2011/06/24/link-communities-r-package/screen-shot-2011-06-22-at-8-24-53-pm/) And the spatial network layout options are great as well; the various community visualizations are simple, elegant, and very pretty: ![](/assets/20110624_link_communities_r_package/screen-shot-2011-06-24-at-6-20-06-pm.png) In addition, there are many neat features. For example, *linkcomm* allows you to visualize sub-communities by themselves. Alex has also published an Application Note in Bioinformatics about the implementation, so take a look if you're interested: (open access). We also link to the package from our [link communities download page](http://barabasilab.neu.edu/projects/linkcommunities/).

Tu Vuò Fà L'Americano

Tue, 21 Jun 2011 10:55:27 GMT

I'm excited to leave Boston for a bit to participate in [ARS'11](http://www.ars2011.unisa.it/uploads/2825/homears1305.gif "Ars Website"): The Third International Workshop on Social Network Analysis, Collaboration Networks and Knowledge diffusion: Theory, Data and Methods. It takes place in Naples, Italy this week, and the speaker line-up looks exciting (despite the fact that they invited me) [1]. Here's a bit of text from the official description: > **ARS'11 International Workshop is a follow up to two very successful previous editions ([ARS'07](http://www.unisa.it/Centri_e_Vari/statlab/ars2007/index.php "ARS'07") and [ARS'09](http://www.unisa.it//Centri_e_Vari/statlab/ars2009/english/index.php?cms=yes "ARS'09")) and will be held on June 23-25, 2011 in Naples (Italy).** > Collaboration networks attract a lot of attention in many fields and are considered a key element in the advancement and dissemination of knowledge in scientific as well as in socio-economic domains. The workshop has the objective of presenting the most relevant results and recent developments in the areas of Collaboration Networks, Innovation Networks and Knowledge Diffusion. > The workshop also aims to deepen existing scientific cooperation between Social network analysts, to establish new cooperation between researchers, and to provide a forum for exchange of ideas among them. > The workshop topics include: > > - Collaboration theory > - Analysis of innovation networks in economics environments > - Sources of collaboration data > - Social Network Analysis methods for collaboration data ## Notes: [1] I stole the idea for this elegant, faux self-deprecating plug from [Aaron Clauset’s blog](http://www.cs.unm.edu/~aaron/blog/archives/2011/01/conference_appl.htm).

Back in the USA

Mon, 30 May 2011 10:01:40 GMT

I'm delighted to report that I'm back in Boston for the summer. The next couple of days (May 31st and June 1st), I'll be attending the [Interdisciplinary Workshop on Information and Decision in Social Networks](http://wids.lids.mit.edu/), which looks to be really exciting. And for the next couple of months I'll primarily be at the [Center for Complex Network Research](http://www.barabasilab.com/) at Northeastern University. So do look me up if you're in town.

Whitman

Sun, 20 Mar 2011 09:22:52 GMT

I recently came across the following Whitman poem: ### When I Heard the Learn’d Astronomer > When I heard the learn’d astronomer, > When the proofs, the figures, were ranged in columns before me, > When I was shown the charts and diagrams, to add, divide, and measure them, > When I sitting heard the astronomer where he lectured with much applause in the lecture-room, > How soon unaccountable I became tired and sick, > Till rising and gliding out I wander’d off by myself, > In the mystical moist night-air, and from time to time, > Look’d up in perfect silence at the stars. This poem beautifully captures the feeling that when you quantitatively analyze something (be it Nature or literature), it often feels like some of the initial beauty and magic of the phenomenon disappears [1]. As a scientist, the position that a scientific viewpoint somehow diminishes 'beauty and magic', is something you run into once in a while, so it's good to have an answer. My own reply is that while it's true that analysis tends to strip many phenomena of some kind of immediate (and often trivial) appeal, digging deeper almost always reveals new layers of beauty. I had developed some examples to go along with this argument, based on my own experiences, but a couple of years ago, I watched an interview with Richard Feynman [2], and his answer is so much better than mine that I'll leave the rebuttal of Whitman to him: http://www.youtube.com/watch?v=ZbFM3rn4ldo ### Postscript After writing the above, I googled the poem - I guess I should have done that before writing - and found a lot of fun/interesting discussions. One commenter pointed to a modern version of Whitman's standpoint courtesy of the [Insane Clown Posse](http://knowyourmeme.com/memes/fcking-magnets-how-do-they-work) (from *Miracles,* 2009): > Water, fire, air and dirt > Fucking magnets, how do they work? > And I don’t wanna talk to a scientist > Y’all motherfuckers lying, and getting me pissed. Check out the pages below for more. 
Particularly the comment thread for the first post is a treasure trove: - - [http://www.scottaaronson.com/writings/whitman.html](http://languagelog.ldc.upenn.edu/nll/?p=2371) ### References [1] My own favorite example is that - when conditions are good - there are [9110](http://answers.google.com/answers/threadview/id/742414.html) stars visible to the unaided human eye. I'm pretty sure that bringing up this factoid could ruin a romantic evening under the stars. Anyway, I'm rambling. [2] From the BBC program [Horizon](http://www.bbc.co.uk/sn/tvradio/programmes/horizon/broadband/archive/feynman/). Interview recorded in 1981 - the whole thing is highly recommended.

NetSci 2013: Venue and Dates

Thu, 24 Feb 2011 08:54:53 GMT

It's time to get out your pencils and mark your 2013 calendars: *NetSci 2013 will take place June 3rd - 7th at the new [Royal Library](http://en.wikipedia.org/wiki/Royal_Danish_Library) (the [Black Diamond](http://en.wikipedia.org/wiki/Black_Diamond_(library))) in [Copenhagen, Denmark](http://goo.gl/maps/vHoC).* Along with fellow organizing committee members [Petter Holme](http://www.tp.umu.se/~holme/), [Joachim Mathiesen](http://www.nbi.dk/~mathies/), and [Alan Mislove](http://www.ccs.neu.edu/home/amislove/), I'm excited to announce that we've secured an incredible venue for NetSci 2013. In order to provide non-Copenhageners with a sense of how amazing this space is going to be, I've included a few photos: ![](http://dl.dropbox.com/u/153071/black_diamond/Black%20diamond.jpg) ![](http://farm1.static.flickr.com/3/6506528_5516161b36.jpg) ![](http://www.photosight.org/up/2008/10/26/92235.jpg) And the interior is spectacular as well: ![](http://dl.dropbox.com/u/153071/black_diamond/2679547688_2edb3152e6.jpg) ![](http://dl.dropbox.com/u/153071/black_diamond/81798.jpg) ![](http://dl.dropbox.com/u/153071/black_diamond/3973906292_ed5fbbd9ac.jpg) And the venue is, of course, just the beginning - we have many more pleasant surprises planned for NetSci 2013. Stay tuned for updates. ### Image credits (in order of appearance): - http://www.flickr.com/photos/snorri/158763944/ - http://www.flickr.com/photos/heilemann/6506528/ - http://www.photosight.org/photo.php?photoid=92235 - http://www.flickr.com/photos/16nine/2679547688/ - http://www.photosight.org/photo.php?photoid=81798 - http://www.flickr.com/photos/janlykke/3973906292/

Conference: Applications of Network Theory

Tue, 22 Feb 2011 01:53:41 GMT

Just a quick advertisement for an exciting European conference co-organized by my fellow NetSci 2013 organizer [Petter Holme](http://www.tp.umu.se/~holme/). It takes place in Stockholm, Sweden in early April. The speaker line-up looks pretty good, despite the fact that they invited me [1]. ### [Conference on Applications of Network Theory](http://www.nordita.org/nwkconf2011/) *Date & Location*: 7 - 9 April 2011 at [AlbaNova](http://www.albanova.se/) in Stockholm Sweden *Organizers*: Peter Minnhagen (Umeå) and Petter Holme (Umeå) *Invited speakers*: Lada Adamic, University of Michigan Albert-Laszlo Barabási, Northeastern University Jordi Bascompte, Consejo Superior de Investigaciones Cientificas Sebastian Bernhardsson, Niels Bohr Institute Vincent Blondel, University of Louvain Aaron Clauset, University of Colorado Sergey Dorogovtsev, University of Aveiro Birgitte Freiesleben de Blasio, University of Oslo Thilo Gross, MPI Dresden Kimmo Kaski, Aalto University Beom Jun Kim, Sungkyunkwan University Renaud Lambiotte, FUNDP Vito Latora, Catania University Sune Lehmann, Technical University of Denmark Fredrik Liljeros, Stockholm University Jukka-Pekka Onnela, Harvard University Juyong Park, Kyung-Hee University Veronica Ramenzoni, MPI Nijmegen Martin Rosvall, Umeå University Jari Saramäki, Aalto University Bo Söderberg, Lund University Brian Uzzi, Northwestern University Jevin West, University of Washington *Description*: The main idea is to convene key world-class researchers on complex networks and let them interact freely with the Nordic groups interested in the area. The program will be divided into four thematic areas: biological networks, general network theory, technological networks, and social networks. Many of the intended participants are interested in several of these points. 
Much progress in network theory has been made by analogies from different fields, and complex-network researchers value this, therefore we believe such a schedule will not seem unattractive to participants. In addition to the regular schedule during the Nordita program, of one or two talks per day, we will arrange a more intense, three day workshop April 7-9. One purpose of this workshop, is to attract researchers not able to stay the extended time required by the program. This workshop is being organized as part of [a long-program on networks](http://agenda.albanova.se/conferenceDisplay.py?confId=1135) at [NORDITA](http://www.nordita.org/). *Registration deadline*: 15 March 2011 or when 70 participants have registered. ### Footnote [1] I stole this last charming and self-deprecating sentence from [Aaron Clauset's blog](http://www.cs.unm.edu/~aaron/blog/archives/2011/01/conference_appl.htm).

Tell a Story!

Sun, 23 Jan 2011 11:43:29 GMT

Although I'm trying to cut down on my podcast use -- to see if a bit of mind-wandering might be good for my brain [1] -- I still allow myself to listen to podcasts to alleviate the pain of some of the dreariest of chores (e.g. cleaning the bathroom). On those occasions I'm currently working my way through the podcast *[RadioLab](http://www.radiolab.org/)*'s excellent back catalog [2]. While always interesting and informative, the *RadioLab* podcast I listened to yesterday is worth a special shout-out. The podcast featured a simple recording of the speech co-host Robert Krulwich was invited to give at CalTech's commencement back in 2008 (you can listen to it [here](http://www.radiolab.org/blogs/radiolab-blog/2008/jul/29/tell-me-a-story/)). During the passionate (and funny) speech, Krulwich argues for the value of science communication; not just in general, but also when people ask you about your work: > But because this is your day, and because this person loves you, or because he can’t think of anything to say after “hi,” he asks about your work. And to make it still more interesting, let’s assume that if you explain to this person what you’ve been working on, you might have to use certain words like “protein” or “quark” or “differential” or maybe “hypotenuse.” And if you do, he is going to listen to you very, very politely, but upstairs, those words are going to mean not a whole lot. [...] So ... here’s my question: When you are asked, “What are you working on?” should you think, “There’s no way I can talk about my science with this guy, because I don’t have the talent, or the words, or the patience to do it—it’s too hard, and anyway, what’s the point?” [3] Now, Krulwich argues (and I wholeheartedly agree) that it's important to come up with a *good* answer to this question. 
### The 'science story' is a weapon against the 'nut-case story' So the podcast is great, and you should be [listening to it](http://www.radiolab.org/blogs/radiolab-blog/2008/jul/29/tell-me-a-story/), rather than reading this. But just in case you're not convinced, I'll highlight a few of the elements I think are most important. First of all, Krulwich has a good argument as to *why* science communication is important. It's not because there's an intrinsic value in enlightening the spirit of man. It's because reason is at war with all sorts of irrational/crazy causes: > [E]ven if it’s hard to explain, even if you know they don’t really want to hear it, not really, I urge you to give it a try. Because talking about science, telling science stories to regular folks, is important. In a way, it’s crucial. Scientists need to tell stories to nonscientists, because science stories—and you know this—have to compete with other stories about how the universe works, and how it came to be. And some of those other stories—Bible stories, movie stories, myths—can be very beautiful and very compelling. But to protect science and scientists—and this is not a gentle competition—you’ve got to get in there and tell your version of how things are, and why things came to be. So Krulwich makes the excellent point that to most people a story is just a story. And a science-story is no different from a religion-story. The only way to defend science is to tell better stories; to tell stories that are more compelling -- also on an emotional level. ### Are metaphors bad? The other element in the talk that I wanted to highlight is Krulwich's discussion of the use of metaphor and (potential) lack of precision in science communication: > And yet many scientists remain wary of metaphors, of adjectives [...] 
But the job we face is to put more stories out there about nature that are true and complex—not dumbed down—and that still have the power to enthrall, to excite, and to remind people that there’s a deep beauty, a many-leveled beauty in the world. What scientists say is hard-won information, carefully hewn from the world. It’s not the offhand opinions of a tribe of privileged intellectuals who look down on everybody. It’s my sense that if more scientists wanted to, they could learn how to tell their stories with words and pictures and metaphors, and people would hear and remember those stories and not be as willing to accept the other folks’ stories. Or at least there’ll be a tug of war, and I think that the science stories will, surprisingly, very often win. To me, the key words here are 'true and complex' coupled with 'a deep beauty'. It's true that you can't *really* explain measurement in quantum mechanics to someone who doesn't know what an [eigenvalue](http://www.quora.com/What-is-the-best-way-to-intuitively-explain-what-eigenvectors-and-eigenvalues-are?srid=X9j) is, etc. But you can still convey the absolute weirdness and wonder of the laws that govern all things quantum. Science itself should not be dominated by metaphor or vagueness; science is about incremental discovery of complex relations. This process of discovery is built on the precision and clarity of scientific colleagues and the giants on whose shoulders we stand [4]. But that doesn't mean that you shouldn't spin an entertaining yarn to explain and motivate your research. So tell a story! Just remember to stay clear of condescension and to stay true to the complex reality that underlies your work. Notes [1] Steven Johnson's excellent new book *[Where Good Ideas Come From](http://www.amazon.com/Where-Good-Ideas-Come-Innovation/dp/1594487715)* suggests that a bit of mind-wandering is one way to allow ideas to 'bubble' to the surface (if I remember correctly). 
I'm not sure that it works, but I guess that a break from the usual near-constant stream of input can't be a bad thing. [2] In case someone's interested, my other favorite podcasts are (1) the absolutely unmissable [Mark Kermode and Simon Mayo's Film Reviews](http://www.bbc.co.uk/programmes/b00lvdrj) which features the best film reviews in the universe (sorry [@ebertchicago](http://twitter.com/#!/EBERTCHICAGO)), (2) [NYT's Book Review](http://www.nytimes.com/ref/books/books-podcast-archive.html), and (3) the classic [This American Life](http://www.thisamericanlife.org/). [3] Text here and in the following was copied from the official transcript. [Download it here [pdf]](http://eands.caltech.edu/articles/LXXI3/Krulwich.pdf). [4] Some attentive readers may have noticed my subtle reference to [Newton's famous metaphor](http://en.wikipedia.org/wiki/Standing_on_the_shoulders_of_giants) (which, according to Wikipedia, doesn't really originate from a Newton quote).

2010 in review

Sun, 02 Jan 2011 15:36:20 GMT

The artificial intelligence engine at WordPress (which hosts this page) sent me an email with some stats on how the site has been doing since I set it up back in June. According to the analysis, the page is "fresher than ever", so I'm delighted. The email even had a convenient button to post the whole thing right at the bottom. And since I haven't posted anything for a while I thought, "why not". No review of my online 2010 would be complete, however, without mentioning the [Twittermood project](http://www.ccs.neu.edu/home/amislove/twittermood/) I did with [Alan Mislove](http://www.ccs.neu.edu/home/amislove/), [YY Ahn](http://www.yongyeol.com/wp/about/), [JP Onnela](http://jponnela.com/), and Niels Rosenquist. That project earned us [302 713 views on YouTube](http://www.youtube.com/watch?v=ujcrJZRSGkg) (at the time of writing) and global press attention with large amounts of TV, radio, print, and internet coverage ([click here](http://sunelehmann.com/press/) for full details). Recently, the visualization was mentioned [first among Mashable's best infographics of 2010](http://mashable.com/2010/12/27/best-infographics-2010/), which generated a mini-surge of traffic for the YouTube video. Anyway, the unedited message is below: The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here's a high level summary of its overall blog health: ![Healthy blog!](/assets/20110102_2010_in_review/meter-healthy3.gif) The *Blog-Health-o-Meter™* reads Fresher than ever. ## Crunchy numbers ![Featured image](/assets/20110102_2010_in_review/abstract-stats-3.png) *A helper monkey made this abstract painting, inspired by your stats.* A Boeing 747-400 passenger jet can hold 416 passengers. This blog was viewed about **3,600** times in 2010. That's about 9 full 747s. In 2010, there were **13** new posts, not bad for the first year! There were **38** pictures uploaded, taking up a total of 53mb. That's about 3 pictures per month. 
The busiest day of the year was July 22nd with **207** views. The most popular post that day was [Worlds Colliding. Part II](http://sunelehmann.com/2010/07/14/worlds-colliding-part-ii/). ## Where did they come from? The top referring sites in 2010 were **twitter.com**, **ccs.neu.edu**, **barabasilab.com**, **iq.harvard.edu**, and **barabasilab.neu.edu**. Some visitors came searching, mostly for **sune lehmann**, **sune lehman**, **sune**, **lehmann sune**, and **sune lehmann nature**. ## Attractions in 2010 These are the posts and pages that got the most views in 2010. 1 [Worlds Colliding. Part II](http://sunelehmann.com/2010/07/14/worlds-colliding-part-ii/) July 2010 2 [About](http://sunelehmann.com/about/) June 2010 2 comments 3 [Press](http://sunelehmann.com/press/) June 2010 4 [Visualizing Link Communities](http://sunelehmann.com/2010/11/03/visualizing-link-communities/) November 2010 1 comment 5 [Mood, twitter, and the new shape of America](http://sunelehmann.com/2010/07/13/mood-twitter-and-the-new-shape-of-america/) July 2010 2 comments

Visualizing Link Communities

Wed, 03 Nov 2010 05:10:43 GMT

When [YY Ahn](http://yongyeol.com/), [Jim Bagrow](http://bagrow.com/), and I published our [paper on communities of links in complex networks](http://www.nature.com/nature/journal/v466/n7307/abs/nature09182.html), we did share the [code](http://barabasilab.neu.edu/projects/linkcommunities/) for the algorithm, but one of the essentials missing from our package was a good way to visualize the highly overlapping link communities. [![Link-communities Visualization](http://dl.dropbox.com/u/153071/website/link-viz.png)](http://scaledinnovation.com/analytics/communities/comlinks.html) Thus, I'm delighted to report that Rob Spencer over at [Scaled Innovation](http://scaledinnovation.com/) has done a great job of [visualizing the detected link communities](http://scaledinnovation.com/analytics/communities/comlinks.html) (including a new client-side implementation, I might add). The technical details are interesting and available. The example displayed above is lifted from Scaled Innovation and shows the network of characters in *The Wizard of Oz*. In addition to the central visualization reproduced above (see below for details), the page also shows the full link dendrogram and many other treats; everything is beautifully crafted. Note the community assignment matrix on the right, which is a neat way of probing the issue of nested communities. On the page, Rob has a number of interesting observations regarding visualization of the link communities and explains the layout above in further detail. I quote: > The good news is that the ABL method is powerful and flexible. The challenge is that the communities it reveals are of links, not nodes, and therefore not as obvious to portray and interpret. So far the literature method is to use a traditional force-based network diagram and color the lines between the dots, rather than color the dots. 
Not bad, but this has the limitations of force-directed network diagrams have always had: a big "wow factor" but of limited practical interpretive use because of the spaghetti of crossing lines. So here you'll find outright experiments, and that means that some will be different! > > In the upper circular graph the dots are the nodes and the polygons show community membership of those nodes (the colors match the table and dendrogram); line crossing is minimized by working around in cluster-joining order (same as the ROYGBIV color order). Communities are equally distributed around the circle with anchor points shown as black-centered dots; each node is placed as the weighted sum of its coordinates of each anchor to which it belongs, plus some random jitter to separate nodes with single community membership. The community ordering and coloring has an interesting result: the diagram gets simpler to see as the number of communities is increased, even far above the partition density "optimum". > The method is fast because it's completely deterministic and drawn in one pass, i.e. it's not an iterative force-relaxation method. ### Pervasive overlap and visualizations While Rob's visualization shows tremendous progress on a number of fronts (just compare it to [our own - primitive - first stab at visualizing](http://barabasilab.neu.edu/projects/linkcommunities/LinkClusteringFlex.swf) the network of characters in *Les Miserables*), I still think that node based visualizations of the link communities work best when we study ego-networks (a single person and her neighbors). As we point out in the paper, we can visualize the ego-network precisely because the central node's communities are largely non-overlapping. So in the example above, Dorothy is the Ego, placed in the center of the visualization, while the various non-overlapping story lines appear as communities surrounding her. 
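To make the placement rule in Rob's description a bit more concrete, here is a minimal sketch of how I read it; this is my own reconstruction, not his code. The function names, the membership weights, and the choice to normalize the weighted sum into a weighted average are all assumptions on my part.

```python
import math
import random


def anchor_points(n_communities, radius=1.0):
    """Spread one anchor per community evenly around a circle."""
    return [
        (radius * math.cos(2 * math.pi * k / n_communities),
         radius * math.sin(2 * math.pi * k / n_communities))
        for k in range(n_communities)
    ]


def place_node(memberships, anchors, jitter=0.05, rng=random):
    """Place a node at the weighted average of the anchors it belongs to.

    memberships maps community index -> weight. The weighting is
    hypothetical (e.g. how many of the node's links fall in that community).
    """
    total = sum(memberships.values())
    x = sum(w * anchors[c][0] for c, w in memberships.items()) / total
    y = sum(w * anchors[c][1] for c, w in memberships.items()) / total
    if len(memberships) == 1:
        # Single-community nodes would all land on the anchor itself,
        # so nudge them apart with a little random jitter.
        x += rng.uniform(-jitter, jitter)
        y += rng.uniform(-jitter, jitter)
    return x, y


anchors = anchor_points(4)
print(place_node({0: 1, 1: 1}, anchors))  # halfway between anchors 0 and 1
print(place_node({2: 3}, anchors))        # close to anchor 2, with jitter
```

Because every position follows directly from the memberships, the layout is drawn in a single pass, which fits the quoted remark that the method is not an iterative force-relaxation method.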
One of the consequences of [pervasive overlap](http://sunelehmann.com/2010/06/29/pervasive-overlap/) (when every node is a member of multiple communities), is that we can no longer display the communities as block structures in the network adjacency matrix. Roughly speaking, to form a block structure, we need a single block per node. Some overlap is possible within the framework of block modeling, but when we can have more communities than nodes, this approach breaks down. A similar problem arises in visualization. My guess is that any strategy for visualizing pervasive overlap where *nodes* are the basis of the visualization will ultimately turn out to be problematic for a full network. One possible solution is to follow the example of [CFinder](http://cfinder.org/) and construct a visualization based on the *network of communities* but with the ability to zoom into each community. At the local level, Rob's visualization would be perfect. Comments/ideas are welcome. Note - this post can also be found at the [Complexity and Social Networks Blog](http://www.iq.harvard.edu/blog/netgov/2010/11/visualizing_link_communities.html).

Twittermood 2: Election special

Mon, 01 Nov 2010 05:44:16 GMT

The midterm elections are coming up, so we decided to create our own little twitter mood election center. "Twitter has grown to become an important aspect of public debate and leading up to Tuesday's midterms, the Twitterverse is abuzz with conversations on the topics that will decide the individual races. It is well known that the state you live in plays a role in deciding what issues you care about. By utilizing the fact that conversations on twitter are public, we can geocode individual tweets, and study *where* Americans are talking about specific issues. In this way, Twitter allows us to extrapolate from millions of water cooler conversations and show where the conversations are taking place *right now*." Check it out by clicking on one of the images below: [![Standard representation](http://twittermood-election.s3.amazonaws.com/images/jobs/jobs.tweets-2010-11-01-05-31.json.norm.png)](http://election.ccs.neu.edu/) [![Cartogram](http://twittermood-election.s3.amazonaws.com/images/jobs/jobs.tweets-2010-11-01-05-31.json.cart.png)](http://election.ccs.neu.edu/) Basically, the idea was to play around with the Twitter stream and do something in real-time for the midterm elections. So we decided to dig into where people are talking about the various issues that are shaping the debate leading up to the election. See the page for full details.

The end of Supporting Material?

Tue, 24 Aug 2010 15:06:51 GMT

Maybe this is how it happens: You see an interesting (seemingly innocuous) paper and decide to read it. Upon finding it very information-dense, you decide to take a look at the supporting information (SI) and notice that the SI has a word count greater in size than an average PhD thesis. Or maybe it's when you decide to print the SI and realize something unusual is going on when your printer is still spitting out paper after half an hour. However you became aware of it, scientific practice has been changing in the last few years. If I remember correctly, supporting information packages started becoming the norm for papers (at least in some journals) only a few years ago and the average SI length has been growing steadily ever since. Now something interesting has happened. From November 1st and onwards, [The Journal of Neuroscience](http://en.wikipedia.org/wiki/Journal_of_Neuroscience) (JNS), a leading journal in that field, [will no longer allow authors to include supplemental material when submitting new manuscripts](http://www.jneurosci.org/cgi/content/full/30/32/10599) (JNS agrees to link to non-peer reviewed supporting material on the author's own site). The decision is explained in detail by Editor-In-Chief John Maunsell, who presents a lucid and interesting argument. He explains that on one hand, the decision was made to make the task of peer reviewing a paper more manageable, i.e. to help the referees: > Although [JNS], like most journals, currently peer reviews supplemental material, the depth of that review is questionable. Most well qualified reviewers are overburdened with requests to review manuscripts, and many feel that it is too much to ask them to also evaluate supplemental material that can be as extensive as the article itself. It is obvious to editors that most reviewers put far less effort (often no effort) into examining supplemental material. Nevertheless, we certify the supplemental material as having passed peer review. 
This surely is an accurate description of the situation many referees find themselves in. Going over every equation and argument in a 100 page SI takes several days, an amount of time that most academics simply don't have available. (In fact the current state of peer review, even without mammoth SI's, has been argued to be [suffering from serious problems](http://www.newscientist.com/blogs/thesword/2010/06/we-need-to-fix-peer-review-now.html).) On the other hand the decision is also intended to protect the authors. > Another troubling problem associated with supplemental material is that it encourages excessive demands from reviewers. Increasingly, reviewers insist that authors add further analyses or experiments "in the supplemental material." These additions are invariably subordinate or tangential, but they represent real work for authors and they delay publication. Such requests can be an unjustified burden on authors. In principle, editors can overrule these requests, but this represents additional work for the editors, who may fail to adequately referee this aspect of the review. > Reviewer demands in turn have encouraged authors to respond in a supplemental material arms race. Many authors feel that reviewers have become so demanding they cannot afford to pass up the opportunity to insert any supplemental material that might help immunize them against reviewers' concerns. The "supplemental material arms race" described eloquently above is another element that I, as an author, can relate to---and suspect that many others feel the same. With no room for peer reviewed SI, each manuscript must be self contained and convincing on its own merits: > A change is needed if we are to maintain the integrity and value of peer-reviewed articles. 
We believe that this is best accomplished by removing the supplemental material from the peer review process and requiring that each submission be evaluated and approved as a complete, self-contained scientific report [...] 
With this change, the review process will focus on whether each manuscript presents important and compelling results. I think most scientists can agree that large SI's present a challenge to the scientific method as we know it. As is argued by JNS, large SI's present a challenge to referees and authors alike and carry the risk of a harmful "SI arms race". But let's consider the suggested solution. In my interpretation, the proposed solution is to introduce more *trust* into the process. By eliminating the peer reviewed SI, the Editor-In-Chief is effectively stating that referees should trust that the authors have done their legwork (data preprocessing, programming, statistical analysis, and other "boring" elements underlying the main results) properly. Of course, the entire foundation of peer review is trust. As referees we begin our task trusting that authors have done their work properly and presented their results honestly. Even a good referee can only be expected to catch mistakes and problems in the material presented to him. So why not a little additional trust? Personally, I am unsure what to think. On one side, I wholeheartedly agree that there are important problems with the current state of affairs. But, on the other side, I think that there are important arguments against allowing too much of the 'legwork' to be left out of the peer review process. Firstly, [examples of scientific misconduct are many](http://en.wikipedia.org/wiki/Scientific_misconduct#Alleged_cases) and the elimination of peer reviewed SI will make sloppy or dishonest science easier. Secondly, and more importantly, as John Timmer at [Ars Technica](http://arstechnica.com/) has recently pointed out, the increasing use of computers could potentially put an end to [the entire concept of scientific reproducibility](http://arstechnica.com/science/news/2010/01/keeping-computers-from-ending-sciences-reproducibility.ars) (precisely because of extensive preprocessing of data, etc). 
Without peer reviewed SI, this problem will be even more difficult to counter. Regardless of the pros and cons, this is an interesting move by JNS. Since JNS allows fairly long articles (typically over ten pages), getting rid of the SI might be easier for JNS and other journals aimed at specific scientific disciplines, than for highly cited interdisciplinary journals - say *Science* or *Nature* - where word-count restrictions for main text are taken very seriously. It will be interesting to see if this policy of "no supporting material" catches on.

Bipartite Network gets a Makeover

Sun, 22 Aug 2010 09:57:54 GMT

I guess my research is slowly changing focus and is more and more about some kind of [data science](http://radar.oreilly.com/2010/06/what-is-data-science.html) (although I still bill myself as a physicist turned network scientist). While statistics and mathematical models are still driving this type of research, an increasingly important part of data science is [visualization](http://twitter.com/suneman/visualization) - finding neat ways to display subtle and complicated mathematical concepts in a way that is immediately understandable. Sometimes, however, visualization can be completely gratuitous eye-candy. Last week, I played around with displaying a weighted bipartite network. One of the default layouts looked something like this: ![](http://dl.dropbox.com/u/153071/website/default.png "Default") Adding Bezier curves, more pleasing node shapes, and a little color, the final network comes across slightly more pleasing to the eye (in my opinion, anyway): ![](http://dl.dropbox.com/u/153071/website/final.png "Final") Stay tuned for the next episode of '[Pimp my](http://en.wikipedia.org/wiki/Pimp_My_Ride) Network'.

Worlds Colliding. Part II

Wed, 14 Jul 2010 09:10:14 GMT

Back in March, I wrote a post entitled [Worlds Colliding](http://www.iq.harvard.edu/blog/netgov/2010/03/worlds_colliding.html) explaining the failure of [Google Buzz](http://en.wikipedia.org/wiki/Google_Buzz) as a failure to understand the fundamental structure of complex networks. Buzz received a [large amount of criticism](http://news.cnet.com/8301-1001_3-10456422-92.html) for automatically adding the most contacted people from your inbox to your Buzz follower list. My post explained that because individuals in a social network are members of many social contexts (family, work, friends, etc), adding nodes from all of these to a single list would cause these contexts to collide (e.g. adding both your wife and your (no longer) secret mistress to your list of followers). The last couple of days, the following talk (from July 1st) by [Paul Adams](http://twitter.com/Padday) who is a User Experience Researcher at Google has been very visible on the interwebs. [slideshare id=4656436&doc=vtm2010-100701010846-phpapp01&type=d] From the looks of it, the good people at the Googleplex have either been [reading my blog](http://www.iq.harvard.edu/blog/netgov/2010/03/worlds_colliding.html) and the [accompanying scientific paper](http://www.nature.com/nature/journal/vaop/ncurrent/abs/nature09182.html) and are scrambling to keep up (I consider this scenario *highly unlikely*) or, the User Experience Group at Google was never in touch with the group behind Buzz. Let me repeat that last part for dramatic effect: *the User Experience Group at Google was never in touch with the group behind Buzz*. The knowledge about [pervasive overlap](http://www.iq.harvard.edu/blog/netgov/2010/06/pervasive_overlap.html) and overlapping communities was present within Google, but never diffused to their initial social networking attempt. So the failure of Buzz was in some sense due to separate worlds within Google not communicating properly. That strikes me as a textbook case of tragic irony. 
### Update, July 15th I've included [YY](http://yongyeol.com/)'s recent slides from the [New Frontiers in Complex Networks](http://cnrc.snu.ac.kr/conference201007/#) conference as a quick intro to our thinking regarding pervasive overlap. [slideshare id=4763055&doc=2010linkclustering-100715083904-phpapp01] The proper reference is [Link communities reveal multiscale complexity in networks](http://www.nature.com/nature/journal/vaop/ncurrent/abs/nature09182.html). Nature (2010), doi:10.1038/nature09182.

Mood, twitter, and the new shape of America

Tue, 13 Jul 2010 15:21:35 GMT

Twitter is a gigantic repository for our collective state of mind. Every second, thousands of tweets reveal what everybody [and their mother](http://tweepml.org/Mothers/) [had for lunch](http://twitter.com/Mickey__Rourke/status/18348383163), what [Justin Bieber](http://en.wikipedia.org/wiki/Justin_Bieber) is up to, or what [magnificent link you should be checking out](http://twitter.com/suneman) right now. Individually, each tweet is mostly interesting to friends/fans of the tweeter, but taken together they add up to something more. In analogy to individual neurons firing together to add up to the human consciousness, the billions of tweets have meaningful macro-states that contain information about the whole system rather than the individual tweeters. But we need to do a little data mining to extract meaningful information about these states, to expose our collective states of mind. As a proof-of-concept we [1] have been studying the mood [2] of all of the public tweets. While there are many services that will allow you to study the [mood of your own tweets](http://tweetmood.info/) ([and also a neat little DIY project to show you the global average of twitter](http://www.instructables.com/id/Twitter-Mood-Light-The-Worlds-Mood-in-a-Box/)), much less effort has gone into studying how the mood breaks down according to geography. Below, I show a brand new video displaying the pulsating 24-hour twitter mood cycle of the United States (I'll explain just what you're looking at, in the following).

In the video, green corresponds to a happy mood and red corresponds to a grumpier state of mind. The area of each state is scaled according to the number of tweets originating in that state. Note how the East Coast is consistently 3 hours ahead of the West Coast, so when we're sleeping in Boston, the Californians are tweeting away. It's also interesting that better weather seems to make you happier (or rather, that better weather is correlated with happier tweets): Florida and California seem to be consistently in a better mood than the rest of the US. Also note how New Mexico and Delaware behave very differently from their neighbors. Full results, individual maps, and a high-res poster can be found on the dedicated [Twitter Mood website](http://www.ccs.neu.edu/home/amislove/twittermood/). ### How to construct the mood map Since many twitter users list their location, we've assigned every tweet in our (massive) database to a US county and extracted their mood. This allows us to average over tweets and plot the mood of the US as a function of geography (and time). However, since the US is unevenly populated, the resulting maps are boring since only a few counties (the centers of cities) contain most of the tweets (not too many tweets in [Ellsworth, Nebraska](http://maps.google.com/maps?f=q&source=s_q&hl=en&geocode=&q=nebraska&sll=41.6725,-70.3514&sspn=0.009761,0.017166&ie=UTF8&hq=&hnear=Nebraska&ll=42.057737,-102.277608&spn=0.019405,0.034332&z=15) yet). Luckily, brilliant people have come up with a cool way of solving this problem using a technique called [density equalizing maps](http://www.pnas.org/content/101/20/7499.abstract) (or [cartograms](http://en.wikipedia.org/wiki/Cartogram)) [3]. The idea here is simple: warp the map in such a way that certain features of shape are conserved, but in such a way that the (population) density becomes the same everywhere. 
The resulting maps look like something from an alternate universe and allow us to show the US mood much more clearly. ### Notes 1. The twittermood project members are [Alan Mislove](http://www.ccs.neu.edu/home/amislove/), [YY Ahn](http://yongyeol.com/), [JP Onnela](http://jponnela.com/), Niels Rosenquist, and undersigned. 2. For a deeper explanation of how we evaluate the mood of tweets, see the [Twitter Mood website](http://www.ccs.neu.edu/home/amislove/twittermood/). 3. An easily accessible explanation of the density equalizing maps, is posted on the [Twitter Mood website](http://www.ccs.neu.edu/home/amislove/twittermood/).
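For readers who want to see the averaging step from "How to construct the mood map" spelled out, here is a minimal sketch under stated assumptions; it is not the project's actual pipeline. The `(region, hour, score)` record format, the function name, and the mood scores themselves are made up for illustration, and the cartogram warping is not shown at all.

```python
from collections import defaultdict


def mood_by_region_and_hour(tweets):
    """Average a per-tweet mood score for each (region, hour) pair.

    tweets is an iterable of (region, hour, score) tuples; this record
    format is hypothetical. Returns {(region, hour): (mean_score, n_tweets)}.
    """
    totals = defaultdict(lambda: [0.0, 0])  # running [sum of scores, count]
    for region, hour, score in tweets:
        entry = totals[(region, hour)]
        entry[0] += score
        entry[1] += 1
    return {key: (s / n, n) for key, (s, n) in totals.items()}


# Made-up example records, only to show the shape of the result.
example = [("MA", 9, 0.25), ("MA", 9, 0.75), ("CA", 9, 1.0)]
for key, (mean, count) in sorted(mood_by_region_and_hour(example).items()):
    print(key, mean, count)
```

The mean would drive the green-to-red coloring, while the tweet count is the quantity used to scale each region's area.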

Erdös Number

Thu, 08 Jul 2010 10:37:13 GMT

The scientific version of the [Bacon number](http://en.wikipedia.org/wiki/Bacon_number) is the [Erdös number](http://en.wikipedia.org/wiki/Erd%C5%91s_number). Via [a post on Finn Nielsen's blog](http://fnielsen.posterous.com/from-the-folklore-of-network-analysis-the-erd), I learned that I have a reasonably low Erdös number - ***three***. (I also learned that Finn is one of the few people with a finite [Erdös-Bacon number](http://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93Bacon_number)). The reason for both Finn's and my own low Erdös number is that my PhD advisor Lars Kai Hansen has co-authored a (highly cited) paper with Peter Salamon, who has an Erdös number of one. The links are: - P. Salamon and P. Erdös. The Solution to a Problem of Grünbaum, *Canadian Mathematical Bulletin*, **31**: 129-138 (1988). - L.K. Hansen and P. Salamon. Neural Network Ensembles, *I.E.E.E. Transactions on Pattern Analysis and Machine Intelligence*, **12**: 993-1001 (1990). - S. Lehmann, M. Schwartz, L.K.Hansen. Biclique communities. *Physical Review E* **78**:016108 (2008). With respect to the Erdös-Bacon number, I could make the case that I should have a number of four. The reason is that I actually appear in the documentary (it's just an uncredited half-second shot of me sitting at my computer) *[Connected - The power of six degrees](http://www.imdb.com/title/tt1310375/)*, which features my ex-boss and renowned scientist [Albert-Laszlo Barabási](http://www.barabasi.com/). Here's the trailer:

But since I don't appear on IMDb, I guess it doesn't really count...
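An Erdös number is just a shortest-path length in the co-authorship graph, so the chain of three papers listed above can be checked with a breadth-first search. The snippet below is a minimal sketch: the function name `collaboration_distance` and the tiny `papers` list (containing only the three papers cited in this post) are illustrative assumptions, not a real bibliographic database.

```python
from collections import deque

# Author lists of the three papers cited in this post (toy data only).
papers = [
    ["P. Salamon", "P. Erdös"],
    ["L.K. Hansen", "P. Salamon"],
    ["S. Lehmann", "M. Schwartz", "L.K. Hansen"],
]


def collaboration_distance(papers, source, target):
    """Return the number of co-authorship hops from source to target, or None."""
    # Build the co-authorship graph: two authors are linked if they share a paper.
    coauthors = {}
    for authors in papers:
        for author in authors:
            others = (a for a in authors if a != author)
            coauthors.setdefault(author, set()).update(others)

    # Breadth-first search visits authors in order of increasing distance.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            return distance[current]
        for neighbor in coauthors.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return None  # no chain of co-authorships connects the two


erdos_number = collaboration_distance(papers, "S. Lehmann", "P. Erdös")
print(erdos_number)  # 3
```

With a larger paper list the same search would return the shortest chain available in that data, which is only an upper bound on the true Erdös number.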

Pervasive Overlap

Tue, 29 Jun 2010 15:07:08 GMT

Just recently, I came across the following video showing [LinkedIn](http://www.linkedin.com/) chief scientist [DJ Patil](http://www.linkedin.com/in/dpatil) explaining the egocentric networks (networks consisting of an individual and their immediate friends) for a few individuals based on their LinkedIn connections.

Although the individuals in the center of these egocentric networks are unusual (in the sense that they have many more LinkedIn connections than the average user), the video clearly shows that each person is a member of *multiple communities* where the communities are dense and almost fully connected, while there are fewer connections between the communities. (If any of this sounds familiar, it's because [I wrote about this subject](http://www.iq.harvard.edu/blog/netgov/2010/03/worlds_colliding.html) a couple of months ago on the [Complexity and Social Networks Blog](http://www.iq.harvard.edu/blog/netgov/).)

This notion of social structure implies that -- seen from the perspective of a single node -- everything is relatively simple: the world breaks neatly into easily recognizable parts (e.g. family, co-workers, and friends). There are few or no links between the communities because we actively work to keep them separate ([more here, on why this is the case](http://www.iq.harvard.edu/blog/netgov/2010/03/worlds_colliding.html)).

I've been thinking about the consequences of this local structure for a while, and recently coauthored a paper on this subject with [YY Ahn](http://yongyeol.com/) and [Jim Bagrow](http://bagrow.com/) [1]. Here, and in an upcoming blog post, I'll be writing about some insights from that work. The idea I hope to explore here has to do with the *global structure* that arises when all nodes in a network have multiple community affiliations, that is, when there is pervasive overlap. In the follow-up, I'll explore how a single hierarchical organization of the network can exist in the presence of pervasive overlap.

### Untangling the hairball

In the standard view of communities in networks, the global structure is modular [2]. This situation is shown below (left), where the communities are labeled using different colors (image from [gephi.org](http://gephi.org/)).
Modular structure on the global level implies, however, that individual nodes can have only a single community affiliation!

[Figure: a modular network (left) and a hairball network (right)]

If every node is a member of more than one community -- and this is clearly the case in the LinkedIn example, as well as in real social networks -- *then the global structure of the network is not at all modular*. Rather, the network will be a dense mess with no visually discernible structure. The network will look like a ball of yarn ... or a hairball (above, right). In fact, this is precisely the type of structure which has recently been discovered in empirical investigations of a comprehensive set of large networks (social and otherwise) [2, 3].

So the question becomes: *How do we find network communities in the hairball?* This is the question YY, Jim and I answer in Ref [1]. The trick is that although *nodes* have many community memberships, each *link* usually belongs to just one community. For example, the link you have to one coworker is similar to the link you have to other coworkers. Thus, by formulating community detection as a question of categorizing links rather than nodes, we are able to detect communities in networks with pervasive overlap.

Using our algorithm, for example, we show that dense hairball networks, such as the [word association network](http://web.usf.edu/FreeAssociation/) (which is what is pictured above, right), contain highly organized internal structure with well-defined and pervasively overlapping communities. We're hoping that our algorithm will help reveal new insights about some of the many highly overlapping social networks, such as the LinkedIn data shown above.

Code for our algorithm may be downloaded [here](http://barabasilab.neu.edu/projects/linkcommunities/); that site also features a neat interactive visualization of the link clustering algorithm.

**Note**: This entry was originally posted on the [Complexity and Social Networks Blog](http://www.iq.harvard.edu/blog/netgov/2010/04/).
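To make the idea of categorizing links rather than nodes concrete, here is a minimal sketch. It is not the released code linked above. It assumes that two links sharing a node are compared via the Jaccard similarity of the neighborhoods of their two non-shared endpoints, and it replaces the full hierarchical clustering with a fixed, hand-picked `threshold`. The function name `link_communities` and the toy network are purely illustrative.

```python
from itertools import combinations


def link_communities(edges, threshold):
    """Group links (not nodes) whose similarity is at least `threshold`.

    Assumes an undirected network without self-loops.
    """
    links = sorted({tuple(sorted(e)) for e in edges})

    # Inclusive neighborhood: each node together with its neighbors.
    neighborhood = {}
    incident = {}
    for u, v in links:
        neighborhood.setdefault(u, {u}).add(v)
        neighborhood.setdefault(v, {v}).add(u)
        incident.setdefault(u, []).append((u, v))
        incident.setdefault(v, []).append((u, v))

    # Union-find over links; merging every similar pair is equivalent to
    # single-linkage clustering cut at the given threshold.
    parent = {link: link for link in links}

    def find(link):
        while parent[link] != link:
            parent[link] = parent[parent[link]]
            link = parent[link]
        return link

    for shared, its_links in incident.items():
        for a, b in combinations(its_links, 2):
            # The endpoints of the two links that are NOT the shared node.
            i = a[0] if a[1] == shared else a[1]
            j = b[0] if b[1] == shared else b[1]
            overlap = len(neighborhood[i] & neighborhood[j])
            union = len(neighborhood[i] | neighborhood[j])
            if overlap / union >= threshold:
                parent[find(a)] = find(b)

    # A node belongs to every community that contains one of its links.
    communities = {}
    for link in links:
        communities.setdefault(find(link), set()).update(link)
    return sorted(sorted(nodes) for nodes in communities.values())


# Toy egocentric network: "you" sit in a family triangle and a work triangle.
toy_edges = [
    ("you", "mom"), ("you", "dad"), ("mom", "dad"),
    ("you", "ann"), ("you", "bob"), ("ann", "bob"),
]
communities = link_communities(toy_edges, threshold=0.5)
print(communities)  # [['ann', 'bob', 'you'], ['dad', 'mom', 'you']]
```

Every link ends up in exactly one group, yet "you" appear in both communities, which is how overlap falls out of a link-based partition. Choosing the cut-off automatically rather than by hand is the part this sketch leaves out.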
## References

- [1] Yong-Yeol Ahn, James P. Bagrow and Sune Lehmann. [Link communities reveal multiscale complexity in networks](http://www.nature.com/nature/journal/vaop/ncurrent/abs/nature09182.html). *Nature.* doi:10.1038/nature09182 (2010).
- [2] Santo Fortunato. [Community detection in graphs](http://sites.google.com/site/santofortunato/review_comm.pdf?attredirects=0). *Physics Reports* **486**, 75-174 (2010).
- [3] J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. [Statistical Properties of Community Structure in Large Social and Information Networks](http://cs.stanford.edu/people/jure/pubs/ncp-www08.pdf). *International World Wide Web Conference (WWW)* (2008).
- [4] J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. [Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters](http://arxiv.org/abs/0810.1355). *arXiv:0810.1355v1* (2008).

An animation

Wed, 23 Jun 2010 13:42:57 GMT

A couple of days ago, [David Lazer](http://www.hks.harvard.edu/davidlazer/html/) asked me to help him generate an animation of the spread of MFN (trade) treaties in the 1860s.

His resulting post on *[The emergence of international order: The case of MFN treaties in the 1860's](http://www.iq.harvard.edu/blog/netgov/2010/06/the_emergence_of_international_order.html)* is worth a read! Also, David's post sparked a nice [follow-up blog post](http://www.themonkeycage.org/2010/06/showing_the_spread_of_mfn_agre.html) focusing on the role of animations as network research tools.

Starting a real page

Tue, 22 Jun 2010 18:41:59 GMT

I wanted to start a real page, to upgrade from my somewhat one-dimensional page over on . As you can probably see, I'm far from done, but I will be updating the site and posting more in the following few days/weeks.

High Throughput Humanities: Final Call for Abstracts

Thu, 22 Apr 2010 16:41:31 GMT

Note: This post was originally posted on the [Complexity and Social Networks Blog](http://www.iq.harvard.edu/blog/netgov/2010/04/).

A quick reminder that April 30th is the final chance to submit an abstract to the [High Throughput Humanities](http://hth.eccs2010.eu) Workshop that I'm organizing along with [Riley Crane](http://www.rileycrane.com/Index.html), [Gourab Ghoshal](http://barabasilab.com/personnel/who.php?who=gghoshal), and [Max Schich](http://www.schich.info/), at this year's [European Conference on Complex Systems](http://www.eccs2010.eu) in Lisbon this September (I wrote about this in more detail [a couple of months ago](http://www.iq.harvard.edu/blog/netgov/2010/03/high_throughput_humanities.html)). We have an amazing Program Committee that includes:

- **Albert-László Barabási**, CCNR Northeastern University, USA.
- **Guido Caldarelli**, INFM-CNR Rome, Italy.
- **Gregory Crane**, Tufts University, USA.
- **Lars Kai Hansen**, Technical University of Denmark.
- **Bernardo Huberman**, HP Laboratories, USA.
- **Martin Kemp**, Trinity College, Oxford, UK.
- **Roger Malina**, Leonardo/ISAST, France.
- **Franco Moretti**, Stanford University, USA.
- **Didier Sornette**, ETH Zurich, Switzerland.

Full details can be found at the workshop website. There's even a neat little introductory video (from our talk at [Ignite Boston 7](http://igniteboston7.eventbrite.com/)):

We hope you will submit an abstract!

Worlds Colliding

Tue, 23 Mar 2010 16:18:48 GMT

During a [press conference at last week's SxSW conference](http://lifehacker.com/5493475/what-the-gmail-team-is-working-on), the product manager of Google's Gmail team, Todd Jackson, revealed an interesting bit of information about the company's [problem-ridden](http://news.bbc.co.uk/2/hi/technology/8517613.stm) new service [Google Buzz](http://www.google.com/buzz):

> Jackson told the crowd, as he's previously said to reporters, that too much was assumed about how Buzz would work best and be received based on Google's internal testing. Google employees didn't have a strong use case for "muting" their fellow Google employees, and the people they'd want to follow and be followed by closely matched up to their contact lists. In general, too, Jackson suggested that Google underestimated the impact of "having a social, public service appear inside ... what is a very private thing (email) for some people" [[1]](http://lifehacker.com/5493475/what-the-gmail-team-is-working-on).

So by testing their social service inside a single context (Google employees only), the developers failed to notice that in real life, people participate in *multiple* contexts (family, work, friends, etc) that they work actively to keep separate. The reasons for wanting to keep these groups separate can range from wanting to keep an illicit affair secret from your spouse to political activists in oppressive regimes wanting to keep certain connections secret from the government [[2]](http://news.bbc.co.uk/2/hi/technology/8517613.stm). Another important reason to keep our communities separate is that we often play different roles - and communicate differently - in different contexts, as illustrated beautifully in the following clip from TV's Seinfeld:

So, ironically, the key problem for Buzz, Google's *social network* service, was that the engineers at the Googleplex had failed to understand an essential property of real-world social networks. Figure 1 illustrates the problem:

![google_vs_real](/assets/20100323_worlds_colliding/google_vs_real.jpg)

Figure 1A shows a cartoon version of Google's internal testing situation. It's clear that in this situation, since an individual (the gray node) only belongs to a single social context, sharing contact information with his neighbors reveals no new information to his social network. However, an ego-centered network in the wild looks more like the situation depicted in Figure 1B. Here, the gray node is a member of several communities (nodes with different colors) with very little communication between communities. Now, because people typically manage all of their 'worlds' from their email inbox, what Google did when it created Buzz's automatic friends lists was to implicitly link people's worlds, revealing precisely the information that people work to suppress, sometimes with [serious implications](http://techcrunch.com/2010/02/12/google-buzz-privacy/).

It is interesting to consider what the structure displayed in Figure 1B implies for the full graph. For an individual, the world breaks neatly into a small set of social contexts, but when every single node is in this situation, then the resulting total structure becomes very different from many of the model networks that are currently in use. In my own corner of the complex networks world, this has serious implications for the rapidly growing field of *community detection* [3]. Currently, most algorithms are designed to search for densely connected sets of nodes that are weakly connected to the rest of the network, and while some methods do include the possibility of community overlap, most break down if the overlap constitutes more than a small fraction of the number of nodes.
If Figure 1B is correct and overlap is present for all nodes, then the idea of communities as weakly connected to the remainder of the network is false -- since communities will have many more links to the outside world than to the inside. I hope to see more research investigating this problem! Oh - and [George Costanza](http://en.wikipedia.org/wiki/George_Costanza) gets to have the last word...

**Update April 3rd, 2010**: I've just become aware of a few excellent blog posts that discuss problems related to Buzz, drawing on ideas very similar to what I present above. Fred Stutzman writes eloquently about Buzz and colliding worlds inspired by Erving Goffman [here](http://fstutzman.com/2010/02/16/what-google-could-learn-from-goffman/). That post sparked [additional 'world-colliding' thoughts](http://) from David Truss (via [this post](http://www.elearnspace.org/blog/2010/02/18/what-google-could-learn-from-goffman/) from George Siemens).

**Note**: This post was originally posted back in 2010 at the now defunct *Complexity and Social Networks Blog* (which was hosted here https://blogs.iq.harvard.edu).

References

- [1] http://lifehacker.com/5493475/what-the-gmail-team-is-working-on
- [2] http://news.bbc.co.uk/2/hi/technology/8517613.stm
- [3] Santo Fortunato. Community detection in graphs. *Physics Reports* **486**:75-174 (2010).

High Throughput Humanities

Fri, 05 Mar 2010 16:30:29 GMT

Note: This post was originally posted on the [Complexity and Social Networks Blog](http://).

Along with [Riley Crane](http://www.rileycrane.com/Index.html) (of [Darpa Challenge](https://networkchallenge.darpa.mil/Default.aspx) and [Colbert Report](http://www.colbertnation.com/the-colbert-report-videos/260725/january-05-2010/riley-crane) fame), physicist [Gourab Ghoshal](http://barabasilab.com/personnel/who.php?who=gghoshal), and quantitatively minded art historian [Max Schich](http://www.schich.info/), I'm putting together a workshop on [High Throughput Humanities](http://hth.eccs2010.eu) as a satellite meeting at this year's [European Conference on Complex Systems](http://www.eccs2010.eu) in Lisbon this September.

The general idea is to bring together people who ask interesting questions of massive data sets. More specifically - as the title implies - we want to figure out how to use computers to do research in the humanities in a way that extends beyond what can currently be accomplished by human beings. Entire libraries are in the process of being scanned and we would like to begin to investigate questions like: Are there patterns in history that are currently 'invisible' due to the fact that humans have limited bandwidth - that we can only read a small fraction of all books in a lifetime? We have an exciting program committee so it should be an interesting day!

**Confirmed Programme Committee Members**

- Albert-László Barabási, CCNR Northeastern University, USA.
- Guido Caldarelli, INFM-CNR Rome, Italy.
- Gregory Crane, Tufts University, USA.
- Lars Kai Hansen, Technical University of Denmark.
- Bernardo Huberman, HP Laboratories, USA.
- Martin Kemp, Trinity College, Oxford, UK.
- Roger Malina, Leonardo/ISAST, France.
- Franco Moretti, Stanford University, USA.
- Didier Sornette, ETH Zurich, Switzerland.

Practical information can be found at the [conference website](http://hth.eccs2010.eu/). Oh, and did I mention that Lisbon is *beautiful* in September!
Sign up and join us. The workshop abstract is reprinted below.

**Abstract**

The High Throughput Humanities satellite event at ECCS'10 establishes a forum for high throughput approaches in the humanities and social sciences, within the framework of complex systems science. The symposium aims to go beyond massive data acquisition and to present results beyond what can be manually achieved by a single person or a small group. Bringing together scientists, researchers, and practitioners from relevant fields, the event will stimulate and facilitate discussion, spark collaboration, as well as connect approaches, methods, and ideas.

The main goal of the event is to present novel results based on analyses of Big Data (see NATURE special issue 2009), focusing on emergent complex properties and dynamics, which allow for new insights, applications, and services.

With the advent of the 21st century, increasing amounts of data from the domain of qualitative humanities and social science research have become available for quantitative analysis. Private enterprises (Google Books and Earth, YouTube, Flickr, Twitter, Freebase, IMDb, among others) as well as public and non-profit institutions (Europeana, Wikipedia, DBPedia, Project Gutenberg, WordNet, Perseus, etc) are in the process of collecting, digitizing, and structuring vast amounts of information, and creating technologies, applications, and services (Linked Open Data, Open Calais, Amazon's Mechanical Turk, ReCaptcha, ManyEyes, etc), which are transforming the way we do research. Utilizing a complex systems approach to harness these data, the contributors of this event aim to make headway into the territory of traditional humanities and social sciences, understanding history, arts, literature, and society on a global-, meso- and granular level, using computational methods to go beyond the limitations of the traditional researcher.