Wednesday, December 31, 2014

Interactive Simple Networks

This post isn't anything new in terms of analysis, just a cooler look at a previous post.  In that post I looked at the board members of large companies, showed via a simple network how those companies share board members, and offered some commentary on how board membership represents desired influence in markets.

Below is an image of those boards and companies that is linked to an interactive network (really it just moves around, using the very cool networkD3 package in R) that is a bit more fun than static network images.

You can click the nodes to highlight what they are and you can move them around.  For analysis purposes it's not necessarily a breakthrough, but the visualization is very cool.

  

Sunday, December 21, 2014

Winning a Marathon (Part 2)

In a previous post I looked at a data set published by the ARRS that provides a lot of data on marathons around the world and specifically the winning times of every* race.  After spending a bit more time with the data, there are a few more things we can take from it that may be more helpful for personal use.

As mentioned before, the data includes ultra-marathons, trail runs, etc.  In an effort to strip those out and keep only road races, I've filtered the data to include only those races with at least 200 participants (counted separately by sex, so at least 200 male participants or at least 200 female participants).  There are still some non-road races in the data that have 200+ participants, but far fewer than before.  So is the data totally "cleaned" of these races?  No.  But I think this gets us closer to the finishing time(s) people are running to win "normal" marathon road races.
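The filter described above can be sketched roughly as follows. This is a minimal Python illustration, not the code I actually ran; the records and the field names (`name`, `sex`, `participants`, `winning_time`) are made up for the example and are assumptions about how the data might be laid out.

```python
# Hypothetical records: names, field names, and values are invented for illustration.
races = [
    {"name": "Race A", "sex": "M", "participants": 1500, "winning_time": "2:21:40"},
    {"name": "Race B", "sex": "M", "participants": 45, "winning_time": "5:48:12"},
    {"name": "Race A", "sex": "F", "participants": 900, "winning_time": "2:49:05"},
    {"name": "Race B", "sex": "F", "participants": 30, "winning_time": "6:30:00"},
]

MIN_PARTICIPANTS = 200  # the threshold mentioned in the post


def filter_by_field_size(rows, sex, min_participants=MIN_PARTICIPANTS):
    """Keep rows for one sex whose field has at least min_participants runners."""
    return [
        row
        for row in rows
        if row["sex"] == sex and row["participants"] >= min_participants
    ]


male_races = filter_by_field_size(races, "M")
female_races = filter_by_field_size(races, "F")
print([row["name"] for row in male_races])
```

The threshold is applied per sex, so a race can survive the filter for male winners and still be dropped for female winners (or the other way around).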


In this case the average winning time is about 2:35:00 for male winners.  We can assume that this would come down slightly with a few more of the ultra-races stripped out.  You can see different race names as you put your cursor over the point (thanks plot.ly!).  This is potentially helpful for finding a race to win that's within your race time.  In the past 10 years the times haven't changed dramatically (contrary to the graph that included all marathon and ultra distances).  Certainly more races were available the past few years than those before, but it seems that those races are all run just as fast as the others.  

Female winning times have also stayed consistent over the past 10 years for races with more than 200 finishers.  


The average time for female winners is around 3:02:00 for the last 10 years.  Again, this is a much lower time than if we had included all races in the data set without some filtering.
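For anyone wanting to reproduce averages like these, the only fiddly part is that winning times are clock strings rather than numbers. A minimal Python sketch, assuming times are stored as "H:MM:SS" strings (an assumption about the format, and the sample values are made up):

```python
from statistics import mean


def to_seconds(hms):
    """Convert an 'H:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Convert seconds back to an 'H:MM:SS' string, rounded to the nearest second."""
    total_seconds = round(total_seconds)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def average_time(times):
    """Average a list of 'H:MM:SS' strings and return the result in the same format."""
    return to_hms(mean(to_seconds(t) for t in times))


# Made-up sample values, only to show the round trip.
sample = ["2:30:00", "2:35:00", "2:40:00"]
print(average_time(sample))  # prints 2:35:00
```

Averaging in seconds and converting back at the end avoids the mistakes that come from averaging hours, minutes, and seconds separately.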

These graphs cover only races in the US.  In general, without personal knowledge of a race (terrain, temperature, organization, etc.), marathon difficulty is hard to measure objectively.  I don't know of any "difficulty index" for marathons (let me know if you know of one), which is why the winning times of races are a good place to start when considering racing with the potential to win.

Sunday, December 14, 2014

Networks and Boardmembers

To start, this is a short basic exercise in social networks.

I’m not sure what being on a board of a large company entails.  In general I get the sense that you have some responsibility in how the company runs and maybe where it’s going.  I’m sure each board is different and the members have varying degrees of responsibility.  I recently saw an image of a network showing the involvement of board members in different companies.  Not sure if companies have rules on being on multiple boards but it seems that at least some don’t.  This image shows different companies and their common board members ~mid 2009.

Image from the book "Networks, Crowds, and Markets:  Reasoning About a Highly Connected World"

As you can see, many companies utilize, and probably think it advantageous to have, a member from another company who may or may not have a common product.  Really with all these companies we’re not necessarily talking about specific products anymore.  When you become as large as these companies are, you probably think in terms of influence.  Which is why the associations of board members become more interesting, because of the scale these people have to think on.  My thought is that the direction of influence is perhaps guided by board members in the same way product offerings would be guided by certain talent acquisition.  Here is a network of 8 companies and their board members that at least in the US (and really worldwide) are top level “influencers” (and happen to be some of the largest companies from a revenue perspective as well).  Companies are shown as yellow vertices and board members as white vertices, with arrows indicating the board affiliation of the person(s).
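The structure behind a network like this is just a list of person-to-company arrows. As a rough Python sketch (not the code I used to draw the network, and with placeholder names rather than real board members), the arrows can be grouped by company and then checked pairwise for overlap:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical edge list (person -> company); names are placeholders, not real data.
edges = [
    ("Person 1", "Company A"),
    ("Person 1", "Company B"),
    ("Person 2", "Company A"),
    ("Person 3", "Company B"),
    ("Person 3", "Company C"),
    ("Person 4", "Company D"),
]


def boards_by_company(edge_list):
    """Group the person -> company arrows into one set of members per company."""
    boards = defaultdict(set)
    for person, company in edge_list:
        boards[company].add(person)
    return boards


def shared_members(boards):
    """Return {(company_a, company_b): common members} for pairs that overlap."""
    shared = {}
    for a, b in combinations(sorted(boards), 2):
        common = boards[a] & boards[b]
        if common:
            shared[(a, b)] = common
    return shared


boards = boards_by_company(edges)
for (a, b), people in shared_members(boards).items():
    print(a, "<->", b, sorted(people))
```

Companies that never show up in the output (Company D here) share no board members with the rest of the network.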




You’ll notice that, though less so than in the 2009 network, there are board members who sit on the boards of other companies in the network.  Does this say anything about those companies' strategy of influence, or maybe communicate a change in their strategy when compared with 2009?  We can color the different companies based on their shared board members and perhaps infer the strategic influence these companies want, at least from their board members*.




Disney shares members with social media giants and Apple.  Exxon shares with Walmart, while Amazon and Google share with none of these (a shift from 2009).  In some ways we could view some of these companies as competitors not because of their product offering but because of the influence they want.  Which begs the question: do Amazon and Google want strategic influence/input from these other companies, or do they see affiliations which were once (in 2009) helpful as now unhelpful associations?  Maybe we’re inferring too much.  Even still, the expansion of Disney recently (and previously) through multiple acquisitions does indicate that the desire to influence across multiple markets is definitely a goal.  Alternatively Apple, Facebook, and Twitter see individuals on Disney’s board as helpful for their own goals (maybe).  Either way, the board member sharing between these 4 companies is interesting not because of the fact they share board members but more so because they all have SO much influence over our social media, entertainment, and technology.  Walmart and Exxon having a common member (largest company, largest energy company) may have more to do with what Mr. Reinemund brings to the table than with strategic expansion...I infer too much perhaps.

In general realizing the sharing of influential people was for me kind of sobering and remarkable in that few people can probably take on roles like these and yet these roles have so much power:  shaping the social, entertainment, and technological world.


*As a side note there are few people who can make decisions on the level of these companies so finding board members who can think like this in and of itself is probably just difficult.  So the fact that these companies share board members could be more of a function of the “pool” of individuals they have to choose from rather than directions or influence they want.  Also, board members may not really be considered agents of the companies on whose boards they sit, but just smart people that are desirable to have influence in a particular company.  The former is just an idea that makes this whole exercise more interesting.

Friday, December 5, 2014

Winning a Marathon

The proliferation of and participation in the marathon have increased substantially in recent years.  No longer is the distance an event reserved for the super-athletic; at least in the US one can, from many vantage points on highways or streets, see the infamous "26.2" sticker adorning a rear windshield.  In a previous post I logged participation in marathons worldwide and, as can be seen from the animation, participation has certainly increased over time in the US.

As participation becomes more the norm we turn now to the question of actually winning a marathon.  The Association of Road Racing Statisticians (yes there is such a thing) maintains an excellent site with all sorts of data on the marathon event as well as other distances.  I created a large file from their data of all marathons each year in the world with their winners.  Marathon in this dataset is anything that is over or equal to 26.2 miles, so that includes trail races or ultra-marathons.  This will make sense when some of the finishing times are seen below.  I plan on talking more about this dataset in future posts but for now we'll look at winning a marathon in the USA.

According to this dataset, in 2013 there were 1,984 marathon events in the US (wow).  And seemingly Fall is the most popular time to host them (ya know before the Holidays).
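A count like this by time of year can be sketched in a few lines of Python. This is only an illustration: the dates are made up, and mapping months to seasons by the Northern Hemisphere meteorological convention (Fall = September through November) is my assumption here.

```python
from collections import Counter
from datetime import date

# Hypothetical race dates, invented for illustration.
race_dates = [
    date(2013, 10, 6),
    date(2013, 10, 13),
    date(2013, 11, 3),
    date(2013, 4, 15),
    date(2013, 6, 22),
]

# Assumed month -> season mapping (Northern Hemisphere meteorological seasons).
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}


def season_counts(dates):
    """Count how many race dates fall in each season."""
    return Counter(SEASON_BY_MONTH[d.month] for d in dates)


counts = season_counts(race_dates)
print(counts.most_common())
```

Swapping `SEASON_BY_MONTH[d.month]` for `d.month` gives the same count at month resolution.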


So how fast do you need to run to win one of these, or at least have a decent shot at winning?  Obviously there is a lot of variance depending on which one; as may be intuitive, race purse/recognition is highly correlated with race speed*.  In general for the past several years in the US, the winning time for a male has averaged about 3 hours.  As more races have been created, giving opportunity to more people, the average winning time has slowed slightly.  In 2013 you "only" needed to run in the 3:30 range to win a marathon, on average across 1,984 races.




It is interesting to note that qualifying for the Boston Marathon in 2013 as a male required a time of roughly 3 hours (I wonder if they based that on average winning times over the last 10 years).  Female winning times look similar in that they too show a slight bump toward "slower" average winning times in 2013/2014.




More recently across all the marathons in the USA, women are winning marathons at around the 4 hour mark.  Again, this all depends on the race one is entering.  But if you are like some of the people who run multiple marathons a year, hitting these averages gives you a decent chance at winning...especially as you heavily consider the number of participants and/or the purse involved ;-) 

For those interested, most of the code for pulling this data and the graph(s) will be on my GitHub page.

*More challenging races (ultra-distance, trail, etc.) are included in the dataset (not all races were created equal), and perhaps more vetting of individual races in this dataset is needed to fully appreciate the finishing times.  A more vetted dataset would surely yield a lower finishing time for both male and female winners; however, combing through every race is beyond the scope of this post...maybe when I have a bit more time.