Data Science

Inside FollowGraph: Explore the Technique Behind the Technology

Lyden Foust


July 19, 2023

If you are looking for more detail on how FollowGraph was developed, this is it. 

FollowGraph quantifies consumer interest in brands, influencers, media, and more using social media following data. The idea for FollowGraph came from a question we got at the ICSC Research & Connections conference: “Instead of mapping behavior, could you map brand hashtags?”

The assumption was that brand hashtags would indicate a strong level of purchase intent. However, we learned that hashtags were weak signals of intent because they aren’t used frequently and the sentiment is often unclear. But the question was still relevant: “How can you map brand intent?”

The lightbulb moment came during the development of PersonaLive when we discovered we could rank segments on an index of how many people were following a social media account. Hashtags weren’t the thing. It was following. When you follow a brand, the sentiment is clear; you want that brand to show up in your feed and to see more of that content. 

In other words, people might sporadically hashtag #Wendys whether they had a good or bad experience, but the only reason you’d follow @Wendys is if you love the brand. 

Dataset Structure

And that is what FollowGraph is; it’s a way to map ‘brand love’. 

More specifically, it’s a way to map love for social media accounts because there is more to social media than just brands. FollowGraph variables have a top-level category and subcategory. Top-level categories include:

  • Brands (Audi, Apple, etc.)
  • Interests (Dog Enthusiasts, Vegans, Wine Lovers, etc.)
  • Channels (Websites, TV Channels, Magazines, News, etc.)
  • Influencers (Barack Obama, Sofia Vergara, Greta Thunberg, etc.)

The dataset is generated at the census block group level for all 2,000+ variables. Interest variables in FollowGraph are unique groupings of accounts that collectively signify a strong interest in a theme. For instance, the ‘Athleisure Enthusiast’ variable consists of those following multiple athleisure brands like @aloyoga, @lululemon, and @Athleta.
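To make the idea of an interest variable concrete, here is a minimal sketch of how a grouping like ‘Athleisure Enthusiast’ could be checked for a single account. The function name, the `min_matches` threshold of 2 (our reading of "multiple"), and the sample follow lists are assumptions for illustration, not the production logic.

```python
# Accounts named in the article for the 'Athleisure Enthusiast' grouping.
ATHLEISURE_ACCOUNTS = {"@aloyoga", "@lululemon", "@Athleta"}


def is_enthusiast(followed, group=ATHLEISURE_ACCOUNTS, min_matches=2):
    """Return True when `followed` contains at least `min_matches` accounts from `group`.

    `min_matches=2` is an assumed threshold for "following multiple brands".
    """
    return len(set(followed) & group) >= min_matches


# Hypothetical follow lists for two accounts.
fan = {"@lululemon", "@Athleta", "@Wendys"}
casual = {"@lululemon", "@Pepsi"}

print(is_enthusiast(fan))     # follows two athleisure accounts
print(is_enthusiast(casual))  # follows only one
```

Requiring more than one match is what separates a themed interest from a single-brand variable: following @lululemon alone counts toward the brand, not the interest.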


FollowGraph scores every census block group on an index where 100 = average. A score of 200 would indicate people in that block group are 2x more likely to follow the account than average.

People in this block group have an index of 257. They are 2.57x more likely to follow @lululemon. 


There are four steps to building FollowGraph:

  1. Connect. Connect public social media accounts to location to form a bridge between online social behaviors and geography.
  2. Calibrate. Integrate billions of data points from social, mobile, and demographic variables to calibrate AI models.
  3. Categorize. Predict each block group's likelihood to follow any of the 2,000+ social media accounts and interests.
  4. Calculate. Calculate an index for every block group compared to the nation to provide a precise view of population behaviors.

FollowGraph is derived from the analysis of social following data from publicly available accounts to assess following rates. Each social account is geocoded and connected to area and individual characteristics. Each variable is then individually modeled using these area and individual characteristics to generate predicted following rates for individuals and neighborhoods. 

For example, the variable ‘Pepsi’ is computed using accounts that follow @Pepsi. Using machine learning techniques, two unique models are generated to predict Pepsi following. The first model predicts the following rates using only area characteristics and the second enhances the first model by including individual characteristics. 
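The two-model idea can be sketched in a few lines. This is a toy illustration only: it swaps the machine learning models for simple group averages, and the rows, the `area_type` and `age_band` features, and the function names are all hypothetical. The first model uses only an area characteristic; the second starts from the first model's prediction and adds a correction learned from an individual characteristic.

```python
from collections import defaultdict

# Hypothetical training rows: (area_type, age_band, follows_pepsi)
ROWS = [
    ("urban", "18-34", 1),
    ("urban", "18-34", 1),
    ("urban", "35+", 1),
    ("urban", "35+", 0),
    ("rural", "18-34", 1),
    ("rural", "18-34", 0),
    ("rural", "35+", 0),
    ("rural", "35+", 0),
]


def group_means(pairs):
    """Average the values seen for each key in an iterable of (key, value)."""
    totals = defaultdict(lambda: [0.0, 0])
    for key, value in pairs:
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Model 1: follow rate predicted from the area characteristic alone.
area_model = group_means((area, y) for area, _, y in ROWS)

# Model 2: the mean residual of model 1 for each individual characteristic,
# added on top of the model 1 prediction.
residual_by_age = group_means(
    (age, y - area_model[area]) for area, age, y in ROWS
)


def predict_area(area):
    """Area-only prediction (model 1)."""
    return area_model[area]


def predict_individual(area, age):
    """Model 1 prediction plus the individual-level correction, clipped to [0, 1]."""
    return min(1.0, max(0.0, area_model[area] + residual_by_age[age]))


print(predict_area("urban"))
print(predict_individual("urban", "35+"))
```

A real implementation would use many more features and a proper learning algorithm, but the layering is the same: the area model gives every neighborhood a baseline, and the individual model refines it.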

Model inputs include:

  1. Census and individual demographics
  2. Movement data
  3. Social media data

The predicted likelihood to follow any social account or group of accounts is indexed to the national average (individual or census block group follow rate / national follow rate × 100, so that 100 = average). The result is a series of indices for geographic areas (and individuals) which measure the propensity of consumers to follow a wide array of social media accounts relative to the national average.
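As a quick worked example of the indexing step, the sketch below divides a predicted follow rate by the national rate and scales the result so that 100 means average. The block group IDs and the rates are made up for illustration, and `follow_index` is a hypothetical helper name.

```python
def follow_index(local_rate, national_rate):
    """Index a follow rate against the national rate, where 100 = average."""
    if national_rate <= 0:
        raise ValueError("national_rate must be positive")
    return 100.0 * local_rate / national_rate


# Hypothetical predicted follow rates for one account in three block groups.
predicted_rates = {"bg_a": 0.0514, "bg_b": 0.02, "bg_c": 0.01}
national_rate = 0.02  # hypothetical national follow rate

indices = {
    block_group: round(follow_index(rate, national_rate))
    for block_group, rate in predicted_rates.items()
}

for block_group, index in indices.items():
    # An index of 200 reads as "2x more likely to follow than average".
    print(block_group, index, f"{index / 100:.2f}x")
```

Because every variable is divided by its own national rate, indices are comparable across accounts of very different sizes: a niche account and a mass-market brand can both score 257 in the same block group.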

Why Use FollowGraph? 

The strength of FollowGraph is in its wide breadth and straightforward approach to quantifying interests across geography. Unlike traditional datasets, it is based on real behavioral data, not surveys. Further, the social-media-based approach creates an actionable set of variables that can help users map interests, guide marketing spend, identify ideal locations, and more.

Sample Data

Sample data is available for FollowGraph: View sample data

How PersonaLive Segmentation System Works

PersonaLive segmentation uses social media, mobile foot traffic, online activity, and individual-level demographics to organize every US household into one of 80 behavioral segments. These segments provide visibility into the online and offline preferences of the customers visiting any US property.

01 Append

Draw a polygon around a property to identify the behavioral segment of every visitor.

02 Analyze

Rank the top customer types visiting a location. Then match retailers based on online and offline activities of visitors.

03 Activate

Demonstrate visitor brand affinity to close deals. Activate marketing campaigns to drive target segments to your location.
