Here at Polinode we are incredibly excited about attending the SIOP Annual Conference for the first time. For those who have not been before, the SIOP Annual Conference is the annual conference of the Society for Industrial and Organizational Psychology and will be held in Chicago from the 17th to the 20th of April. It will see thousands of individuals (both researchers and practitioners), all of whom have an interest in I-O Psychology, come together for a few days to network and listen to hundreds of presentations on a wide variety of topics.
Given this will be our first SIOP conference, we thought it might be fun (and helpful for both ourselves and others) to use some publicly available data to create some networks that assist with understanding the conference (and provide some insight into the I-O Psychology community more generally). This blog post summarizes how we did this and includes some links to a few interactive networks so that you can explore them in more detail yourself. We hope you enjoy it and we’d love to hear your feedback and comments (via LinkedIn or at info@polinode.com).
SIOP 2024 Program Data
The first couple of networks that we are going to look at are based on the data that can be found in the program for the conference. This program contains the details of the ~200 events/presentations that will be held at the conference and the ~3,000 presenters that will be participating at the conference. We were able to scrape this data and have inserted it into a Baserow table.
The Presenter to Event Network
The first network that we created with the SIOP Program data was the Presenter to Event network. Here we have created an edge between each individual presenter and each event that they are participating in according to the program. Many of the events are Symposiums, Panel Discussions or similar, so there are often multiple presenters for an event, and many of the presenters are participating in more than one event. The result is quite an interesting network of presenters and events.
At this point we should note that in this network we have included all event types except for the Poster event type. We would like to give a huge shoutout to all of the individuals presenting a poster at the SIOP Annual Conference this year - both authors have presented posters at academic conferences before and are very familiar with all of the work that goes into it. The primary reason that we have excluded those making poster presentations here is that the nature of poster presentations doesn’t lend itself to the interconnectivity that characterizes the other event types. Notwithstanding this, we do hope that the networks that we have produced here may prove to be a valuable resource for those that are presenting posters at the conference.
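To make the construction concrete, here is a minimal sketch of how presenter-to-event edges could be built from program rows while skipping the Poster event type. The row layout, the sample names and the function name are assumptions made purely for illustration; they are not the actual scraped data or the code used to build the network.

```python
# Hypothetical program rows: (presenter, event title, event type).
# These values are invented for illustration, not taken from the real program.
program_rows = [
    ("Presenter A", "Symposium 1", "Symposium"),
    ("Presenter B", "Symposium 1", "Symposium"),
    ("Presenter A", "Panel 1", "Panel Discussion"),
    ("Presenter C", "Poster 1", "Poster"),
]


def build_presenter_event_edges(rows, excluded_types=("Poster",)):
    """Return sorted, unique (presenter, event) edges, skipping excluded event types."""
    edges = set()
    for presenter, event, event_type in rows:
        if event_type in excluded_types:
            continue  # poster sessions are left out of this network
        edges.add((presenter, event))
    return sorted(edges)


edges = build_presenter_event_edges(program_rows)
presenters = {presenter for presenter, _ in edges}
events = {event for _, event in edges}
print(len(presenters), "presenters,", len(events), "events,", len(edges), "edges")
```

Because each edge joins a presenter to an event and never two presenters or two events directly, the result is a two-mode (bipartite) network.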
And so here is the resulting network showing the relationships between the 3,059 remaining presenters and 903 events:
At Polinode, we are particularly interested in all things People Analytics. So, one of the first things that we did was search for “People Analytics” in this network and, of course, we quickly found Cole Napper. Cole is understandably connected to Scott Hines (co-host of the Directionally Correct podcast) via the event “LIVE: Directionally Correct Podcast With Special People Analytics Guests”.
Very close to Cole in the network is none other than Richard Rosenow (just slightly above Cole). If you are active in the People Analytics community, Richard is also someone that you are likely very aware of. No doubt Richard will be keeping himself very busy over the course of the conference at the One Model booth but will also be running a Master Tutorial at the end of Friday the 19th of April on Machine Learning Ops for I-O Psychologists.
And, not surprisingly, close to the center of the network we find the event “Invited: Meet the Experts: A Panel of Leading Researchers and Practitioners in I-O” which includes some of the leading minds in I-O Psychology including Deniz Ones, Richard Landers, Nancy Tippins and Tara Behrend (just to name a few). It’s easy to click on any of these individuals to see what other events they are participating in.
There is of course a lot more to this network and we would encourage you to explore it interactively via this link. You will generally find that groups of similar or related events are positioned relatively close to each other - for example, there is a small cluster of People Analytics events towards the middle left, a cluster of mainly Machine Learning and NLP events towards the center right and quite a few DE&I-focused events towards the bottom center.
The Presenter to Presenter Network
The next network that we will take a look at is based on the same set of data as the previous network but, instead of linking presenters and events, we create an edge between two SIOP 2024 presenters if they are participating in at least one event together.
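This step is often called a one-mode projection of a two-mode network. The sketch below shows one way it could be done, assuming the input is a list of (presenter, event) pairs; the sample names and the function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def project_to_presenters(presenter_event_edges):
    """Connect two presenters if they share at least one event.

    Returns a dict mapping each (presenter_a, presenter_b) pair, in sorted
    order, to the number of events the two presenters share.
    """
    presenters_by_event = defaultdict(set)
    for presenter, event in presenter_event_edges:
        presenters_by_event[event].add(presenter)

    shared_events = defaultdict(int)
    for members in presenters_by_event.values():
        # Every pair of presenters at the same event gets an edge.
        for a, b in combinations(sorted(members), 2):
            shared_events[(a, b)] += 1
    return dict(shared_events)


# Invented sample data for illustration only.
sample_edges = [
    ("Ana", "Panel 1"),
    ("Ben", "Panel 1"),
    ("Cy", "Panel 1"),
    ("Ana", "Symposium 2"),
    ("Ben", "Symposium 2"),
]
pairs = project_to_presenters(sample_edges)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

An event with n presenters contributes n(n-1)/2 presenter pairs, so large events can dominate the projected network.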
Below you will find a screenshot of this Presenter to Presenter network and here is a link to the interactive version in Polinode. The network below is sized by the total number of connections that each person has (which will be higher the more events that they are participating in and the more people that are participating in each of those events). We have also applied a community detection algorithm that partitions the network into groups of people that are more closely related to each other than they are to the rest of the network. If you would like some further insight into what these different informal communities or groups represent then please ask us for a copy of your personal network report (but more on that later).
In addition to Total Degree, we have also calculated betweenness centrality for this network - betweenness centrality measures how much of a bridge or broker each person is in the network in the sense that they connect groups of people that otherwise would not be well connected. In the view below we have sized the nodes in the network by betweenness centrality and added some labels to those nodes that have the highest betweenness centrality (the top 10 by betweenness centrality to be precise):
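For readers who want to see how these two metrics can be computed, here is a small self-contained sketch using Brandes' algorithm for betweenness centrality on an unweighted, undirected network. It is an illustration on an invented six-node graph, not the implementation used in Polinode, and the scores are left unnormalized.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Turn a list of undirected (a, b) edges into a node -> neighbours mapping."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def betweenness_centrality(adjacency):
    """Brandes' algorithm; each unordered pair of endpoints is counted once."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from source, counting shortest paths.
        visit_order = []
        predecessors = {node: [] for node in adjacency}
        path_counts = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_counts[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            visit_order.append(current)
            for neighbour in adjacency[current]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[current] + 1:
                    path_counts[neighbour] += path_counts[current]
                    predecessors[neighbour].append(current)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(visit_order):
            for pred in predecessors[node]:
                share = path_counts[pred] / path_counts[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # Each undirected pair was reached from both endpoints, so halve the totals.
    return {node: score / 2 for node, score in scores.items()}


# Two triangles joined by a single bridge between C and D (invented example).
graph = build_adjacency([
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"),
    ("D", "E"), ("D", "F"), ("E", "F"),
])
total_degree = {node: len(neighbours) for node, neighbours in graph.items()}
betweenness = betweenness_centrality(graph)
top_brokers = sorted(betweenness, key=lambda n: (-betweenness[n], n))[:2]
print(top_brokers)
```

In this toy graph C and D are the only nodes that sit on shortest paths between the two triangles, which is exactly the bridging role that betweenness centrality is meant to capture.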
Google Scholar and OpenAlex Data and an Important Note on Data Quality
Having created the above networks, we started to think about what other public data exists that might provide additional insight for the global I-O Psychology community. Given that many of the individuals in this community are researchers, we naturally turned to data on publications. This should by no means detract from the important role that practitioners, who tend to be less focused on publishing, play in this community, and our hope is that the insights generated from this analysis of publications data will also be of use to practitioners.
There are two data sources that we used for information on publications - Google Scholar and the OpenAlex database. Unlike Google Scholar, OpenAlex is friendly towards programmatic access and makes available a well structured API that allows us to query by name and institution in order to retrieve a list of publications for many of the SIOP 2024 presenters. We did, though, use Google Scholar data on “interests” where it was available - by interests we mean the short text categories that researchers can enter on their own profiles, as indicated by the red box in the example below for Tara Behrend.
So, for the two previous networks as well as the two networks that we examine below, we have attached summary data from OpenAlex (such as an author’s number of works and the number of times their work has been cited) as well as the interests data from Google Scholar (where available).
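As a rough illustration of what a name lookup against OpenAlex might look like, the sketch below builds an author search URL and extracts the summary fields mentioned above from a parsed response. The `search` and `per-page` parameters and the `display_name`, `works_count` and `cited_by_count` fields follow our understanding of the public OpenAlex API, but please treat the details as assumptions and check the OpenAlex documentation before relying on them. Narrowing by institution is deliberately left out, and the sample response is invented so that the example runs without network access.

```python
from urllib.parse import urlencode

OPENALEX_AUTHORS_URL = "https://api.openalex.org/authors"


def author_search_url(name, per_page=5):
    """Build an author search URL for a presenter name (no request is sent)."""
    query = urlencode({"search": name, "per-page": per_page})
    return OPENALEX_AUTHORS_URL + "?" + query


def summarize_candidates(response_json):
    """Extract the summary fields of interest from a parsed authors response."""
    return [
        {
            "id": author.get("id"),
            "name": author.get("display_name"),
            "works_count": author.get("works_count"),
            "cited_by_count": author.get("cited_by_count"),
        }
        for author in response_json.get("results", [])
    ]


# Invented response in the assumed shape; a real one would come from an HTTP GET
# against the URL above, parsed as JSON.
sample_response = {
    "results": [
        {
            "id": "https://openalex.org/A0000000000",  # placeholder ID
            "display_name": "Jane Citizen",
            "works_count": 42,
            "cited_by_count": 1234,
        }
    ]
}

url = author_search_url("Jane Citizen")
candidates = summarize_candidates(sample_response)
print(url)
print(candidates[0]["name"], candidates[0]["works_count"], candidates[0]["cited_by_count"])
```

A name search can return several candidate authors, which is why a further matching step is needed before any of this data is attached to a presenter.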
It’s important here to highlight the limitations of the data that we have used. In order to look up the publications data and the interests data, we ran a search on name and institution, as well as on name alone, across both OpenAlex and Google Scholar. We used a combination of signals to determine whether a match was of high enough quality for us to include its data. In reviewing the matches, though, it became clear that entity resolution was an issue for some authors. By this we mean that the publications data for a single result included the publications of multiple distinct people, e.g. Jane Citizen and Janet Citizen or similar. Where the data indicated that this was the case, we excluded that match from the data. Let us illustrate with an example - Eduardo Salas. If you review the OpenAlex profile for Eduardo Salas it is clear that the most commonly cited works are in the psychology domain but that works by other authors who publish in different domains have also been included - this publication on human platelets has been credited to Eduardo Salas as well. Unfortunately we had to exclude the publications data on Eduardo Salas and a handful of other SIOP 2024 authors as it was not sufficiently clean. We apologize to any SIOP 2024 presenters that we were forced to exclude from the retrieval of publications data but note that many of these excluded individuals will still feature in the networks below as we have retrieved the publications data of many of their co-authors.
In total we were able to match 1,039 SIOP 2024 authors to their publications data. This is quite a high number considering that many practitioners and graduate students are not present in the data sources we are making use of.
Co-authorship Network Seeded with SIOP 2024 Presenters
We used the 1,039 matches described above as seeds in order to construct a co-authorship network. That is to say, we retrieved all of the publications for these 1,039 matched SIOP 2024 presenters and created a network of them and their combined co-authors, in which two people are connected if they have co-authored together at least twice. Altogether we retrieved data on 42,193 publications and the resulting network has a total of slightly over 13,000 nodes (in the major component, that is - there are ~38,000 individuals in the network altogether before filtering edges out and limiting the network to the major component). This network is displayed below.
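The edge rule described above can be sketched as follows, assuming each publication is simply a list of author names. The function names and the sample data are hypothetical, and the restriction to the major component is left out of this sketch.

```python
from collections import Counter, defaultdict
from itertools import combinations


def coauthorship_counts(publications):
    """Count how many publications each pair of authors has written together."""
    counts = Counter()
    for authors in publications:
        # set() guards against an author being listed twice on one publication.
        for a, b in combinations(sorted(set(authors)), 2):
            counts[(a, b)] += 1
    return counts


def filter_edges(counts, min_coauthorships=2):
    """Keep only pairs that have co-authored at least min_coauthorships times."""
    return {pair: n for pair, n in counts.items() if n >= min_coauthorships}


def distinct_coauthors(edges):
    """Number of distinct co-authors each person has among the given edges."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {person: len(others) for person, others in neighbours.items()}


# Invented publications for illustration only.
publications = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C"],
    ["A", "D"],
]
counts = coauthorship_counts(publications)
edges = filter_edges(counts, min_coauthorships=2)
coauthor_totals = distinct_coauthors(edges)
print(edges)
print(coauthor_totals)
```

Raising the threshold from one to two co-authorships removes one-off collaborations, which is why D drops out of this small example entirely.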
One of the questions that we were interested in answering with this data is which SIOP 2024 presenters had co-published with the most other authors (in the broader group of ~13,000 people) and the top 10 by this metric are listed below:
We were also keen to understand who was not attending SIOP 2024 but was well connected to this broader group of ~13,000 people and that list is provided below:
Observant readers may notice that Eduardo Salas is included in the list of the most connected SIOP 2024 authors above even though we have previously indicated that his publications data was not directly retrieved. The reason is that the publications data of many of his co-authors was retrieved, and we went through the list of all authors in the network with more than 30 connections to verify whether they were attending SIOP 2024 or not.
Focusing on the SIOP 2024 Presenters Co-authorship Network
We are very interested in how the set of SIOP 2024 presenters are connected to each other and therefore decided to create a new network based on the previous network but with the important difference that it contains only the SIOP 2024 presenters themselves. That is to say, we used the co-authorship data to understand the relationships (based on publications) between the SIOP 2024 presenters themselves. The network below captures exactly this. Here we have sized the edges by the number of times two individuals were co-authors so a thicker edge indicates a greater degree of collaboration. In this network we have filtered the edges such that only relationships with two or more co-authorships have been included. We have also included only the major component of the network (i.e. the largest set of connected individuals). In this subset of the network there are 699 individuals.
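Restricting a network to a subset of people and then keeping only its largest connected component can be sketched as below. The input format (a dict of weighted edges), the function names and the sample data are assumptions for illustration only.

```python
from collections import defaultdict


def induced_edges(weighted_edges, keep_nodes):
    """Keep only edges whose two endpoints are both in keep_nodes."""
    return {
        (a, b): weight
        for (a, b), weight in weighted_edges.items()
        if a in keep_nodes and b in keep_nodes
    }


def largest_component(edges):
    """Return the set of nodes in the largest connected component."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    best = set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first search to collect everything reachable from start.
        component = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Invented co-authorship counts; X and Y stand in for non-presenters.
weighted_edges = {
    ("A", "B"): 3,
    ("B", "C"): 2,
    ("C", "X"): 5,
    ("X", "Y"): 4,
    ("D", "E"): 2,
}
presenters = {"A", "B", "C", "D", "E"}

presenter_edges = induced_edges(weighted_edges, presenters)
major = largest_component(presenter_edges)
major_edges = induced_edges(presenter_edges, major)
print(sorted(major), major_edges)
```

The weights that survive (3 and 2 here) are the co-authorship counts that would drive edge thickness in a visualization.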
It’s important to understand that this network represents quite a few decades of collaboration such that individuals who have been publishing in I-O Psychology for longer will tend to have stronger connections and will likely have more collaborators. In this network we have also calculated the informal communities and colored nodes by them and we have sized each node by the total number of distinct co-authors they have in the network.
It’s interesting to examine what these different communities represent. Certainly there are groupings by institution / geography and also by interest. We do some further analysis on these communities in the individual network reports (see below) and summarize below the most common institution in each of these communities.